ABSTRACT

In the second volume of a four-volume annual report
on the Northeast Academic Science Information Center (NASIC), two
developmental studies are reported. The first documents an
experimental, pilot operation of computer-based reference search
services to users on a fee-for-service basis initiated at
Massachusetts Institute of Technology as the first node of the NASIC
program. The second is a study of effectiveness and cost
effectiveness considerations for NASIC information services
operation. The study reviewed cost factors relating to the selection
of the data bases, the performance of information retrieval and
dissemination systems, the choice of service centers, NASIC products
and services, software considerations, communications and networking
aspects, and management of information service center operation. A
lengthy review of the status of on-line interactive retrieval systems
is attached. (JY)



ERIC 



NORTHEAST ACADEMIC SCIENCE 
INFORMATION CENTER
(NASIC) 



PHASE I REPORT 
(March 1973 - February 1974) 



VOLUME 2 






Submitted in lieu of the 4th Quarterly Progress Report to the Office of
Science Information Service, National Science Foundation by the New England
Board of Higher Education, Wellesley, Massachusetts, under Grant No. GN-37296
May 1974.



TABLE OF CONTENTS 



APPENDIX A NASIC at MIT, Phase I Report - A.R. Benenfeld, M.E. Pensyl,
R.S. Marcus, J.F. Reintjes, Electronic Systems Laboratory, 
Massachusetts Institute of Technology, March 1974. 

APPENDIX B Effectiveness and Cost-Effectiveness Considerations for 
NASIC Information Services Operation - J.W. Kuipers, 
F.W. Lancaster and R.W. Thorpe, QEI, Inc., October 1973. 






March 29, 1974 



Report ESL-R-543 



NASIC AT MIT



PHASE 1 REPORT 



16 JULY 1973 - 28 FEBRUARY 1974 



by 

Alan R. Benenfeld
Mary E. Pensyl
Richard S. Marcus
J. Francis Reintjes



The research reported in this document was performed under contract
to the New England Board of Higher Education in connection with
the NASIC Program funded by the National Science Foundation.



Electronic Systems Laboratory 
Department of Electrical Engineering 
Massachusetts Institute of Technology 
Cambridge, Massachusetts 02139 



ABSTRACT 



An experimental, pilot operation of computer-based reference search
services to users on a fee-for-service basis was initiated at M.I.T. as
the first node in the development of the Northeast Academic Science
Information Center (NASIC) under a New England Board of Higher Education
(NEBHE) program. The development encompassed, among other tasks,
selection of services, training for services, developing the initial
organizational and operational policies and capabilities, publicity about
available services, and the operations monitoring procedures. A
fundamental philosophy is to integrate these services within the library
environment where they complement traditional services. Initial
experiences during a three-month operational period show that (1) a
demand exists for computer-based reference search services; (2) users are
willing to pay, even out-of-pocket, for such services; (3) searches are
often interdisciplinary and require several sources; (4) various publicity
mechanisms are helpful but none so important as satisfied users telling
their colleagues; (5) users like and respond positively to the in-depth,
customized service and personal attention to their bibliographic needs;
(6) extensive training and practice of Information Specialists is
necessary to attain desired levels of service quality; (7) integration of
these services within the library environment may require organizational
and staffing accommodation in addition to the personal cooperation,
goodwill, and enthusiasm of participants.
I. INTRODUCTION AND OVERVIEW



An experimental, pilot operation of computer-based reference
search services to users on a fee-for-service basis was initiated
at M.I.T. on 15 November 1973. It marks a major milestone in the
development of the Northeast Academic Science Information Center
(NASIC). NASIC development is supported by a grant from the National
Science Foundation to the New England Board of Higher Education
(NEBHE). Development of a pilot operation at M.I.T. is supported by
subcontract from NEBHE to M.I.T. The NASIC at M.I.T. project team
includes staff from the M.I.T. Libraries, the Electronic Systems
Laboratory, and the Information Processing Services. This report covers
the work performed on NASIC at M.I.T. from 16 July 1973 through 28
February 1974. This period falls within Phase I of NASIC.

Services for a fee have been provided to more than 60 users. Although
the number of users is small at this still early stage, the ranks
of users continue to grow with the increasing publicity about the
availability of NASIC services. The effort to reach this stage by the M.I.T.
project team encompasses, among other tasks, selection of services,
training for services, developing the initial organizational and
operational policies and capabilities, the publicity about available services,
and the operations monitoring procedures. A philosophy fundamental to
this effort is to integrate such services within the library environment.
Details of the work accomplished on each task are contained in subsequent
sections. The reader is referred in particular to Task 12, Monitoring
and Analysis of Service Operations, for an extended analysis of results.

It is worthwhile highlighting the more important findings to date 
of this development and testing effort. They are: 



1. A demand exists for computer-based reference search services
available on a fee-for-service basis.

2. A measure of the strength of the demand is that many users are
willing to pay out-of-pocket for such services, although the majority of
users to date have access to contract or grant monies.

3. Mechanisms need to be established to support large-scale use of
NASIC services by undergraduates and others who do not have access to
grant or contract monies.

4. These services complement but do not replace more traditional
search modes.

5. A significant number of user search problems are interdisciplinary
in nature and may require searches in more than one data base.

6. Various publicity mechanisms are successful but, not unexpectedly,
the most important one seems to be satisfied users telling their colleagues
about NASIC.

7. Users like and respond positively to the in-depth, customized
service and personal attention to their bibliographic needs.

8. To provide quality service, Information Specialists need to receive
training including fairly extensive practice searching. It takes additional
experience for an Information Specialist to become fully confident, adept,
and at ease with his or her professional ability to provide an intensive
customized computer search.

9. To date, a typical NASIC user appointment lasts 70 minutes. Somewhat
more than half that time, 37 minutes, is spent in on-line connection
to the computer. The average printout request results in 39 pages
containing 131 citations. The total cost of a typical search is $50.47,
composed of $34.90 for computer connection and search and administrative
charges, $9.36 for the time of an Information Specialist, and $6.21 for
off-line printout. However, only 60% of users request off-line printout.

10. Integration of these services within the library environment may
require organizational and staffing accommodation in addition to the
personal cooperation, good will, and enthusiasm of participants.
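
The figures in finding 9 can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the dollar amounts, minutes, pages, and citation counts are taken from the finding, while the variable and function names are hypothetical, and the "cost without printout" figure assumes the printout charge is simply omitted for users who do not request one, which the report does not state.

```python
from decimal import Decimal

# Component charges for a typical search, as reported in finding 9.
COMPONENTS = {
    "computer connection, search, and administrative charges": Decimal("34.90"),
    "Information Specialist time": Decimal("9.36"),
    "off-line printout": Decimal("6.21"),
}


def total_cost(components):
    """Sum the component charges of a search."""
    return sum(components.values(), Decimal("0"))


def cost_without(components, excluded):
    """Total with one component left out (an assumption, see lead-in)."""
    return total_cost(components) - components[excluded]


total = total_cost(COMPONENTS)                 # reported total: $50.47
no_printout = cost_without(COMPONENTS, "off-line printout")

# Derived ratios from the reported averages.
online_fraction = Decimal(37) / Decimal(70)    # share of appointment spent on-line
citations_per_page = Decimal(131) / Decimal(39)

print(f"total: ${total}")
print(f"without off-line printout (assumed): ${no_printout}")
print(f"on-line share of appointment: {online_fraction:.0%}")
print(f"citations per printout page: {citations_per_page:.1f}")
```

The three component charges add up exactly to the reported $50.47, and 37 of 70 minutes is indeed somewhat more than half the appointment.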

The discussion by task which forms the body of this report relates
to the M.I.T. environment. The NASIC at M.I.T. organization reflects this
environment as well as the dual purposes of providing user services and
of developing and testing methods and types of services. As such, the
current setup at M.I.T. represents only one of many organizational models.
We want to experiment with other models. The particular method that other
institutions use to organize NASIC computer-based reference services on
their own campus must reflect their own local environment and needs.
The M.I.T. experience ought to be of help to other universities in setting
their own course.

There are several aspects of the work accomplished within the short
span from commencement of M.I.T. effort on 16 July 1973 to 28 February 1974
which cannot be adequately expressed either in the highlights above or in
the following summary by task. NASIC is an addition to library services
at M.I.T. and not a replacement for traditional services. This is expected
to hold true for NASIC service sites at other educational institutions in
the region. The extension of services in a short time span to include NASIC
has caused an additional burden to be carried by the M.I.T. Libraries. The
Libraries did not have benefit of time for budgetary planning.
Nevertheless, whenever problems arose, the long view of the situation was kept in
mind by all concerned. Some of these problems may also arise at other sites, and
NASIC and the participating institutions need to give them full consideration.
Regular work loads and personnel assignments in the M.I.T. divisional
libraries have been disrupted during the transition period to build, operate,
and continue further development and growth of NASIC services. The
Information Specialists themselves have shouldered most of the burden, but in the
process, and with their enthusiasm, they have shown their mettle. The
administration and staff of the M.I.T. Libraries have enthusiastically
supported efforts to make NASIC services a success while tolerating the
transitory hardships to personnel and regular work loads associated with
the magnitude of their effort.

Despite the usual trials in this development effort, it is obvious
to us at M.I.T. that no matter how much or how little is available in
dollars or in time, people still make the difference. Enthusiasm and
commitment are prerequisite for any university library staff that wishes
to extend its services to computer-based reference and retrieval.
At M.I.T., we have wrestled with an important question.
If each university sets its own operational course, what then is the
role of the Northeast Academic Science Information Center?
NASIC does have a role to play, a very positive role. We have
arrived at this answer by careful review of the activities and events
to launch and carry forward NASIC services at M.I.T. in which the M.I.T.
departments that have cooperated in this venture can be likened to a
NASIC network in miniature. The major functions of a strong central
regional NASIC organization are:

1. Advise academic institutions on preparing for, implementing,
and publicizing computer-based reference services.

2. Offer programs to train staff to levels of competency in
understanding and providing such services that extend beyond current programs
of retrieval system suppliers.

3. Provide a central capability to search those systems or data bases
that are only of infrequent use to an academic institution.

4. Provide a strong, collective voice for the region in dealing with
retrieval systems, data base suppliers, terminal manufacturers, or other
external agencies.

5. Provide a mechanism for disseminating within the region updated
information and solutions to problems of common interest.

In short, a regional NASIC is needed to function as a strong user 
association, a center with the expertise and time to daily make suggestions 
and provide feedback among individual academic institutions and a variety 
of diverse information or equipment suppliers. 

Our prognosis for NASIC remains optimistic. It is possible for an 
organization to implement these services entirely on its own; but in so 
doing, more of its resources will be required in order to fully realize 
the benefits from extending its services to both current and new library 
users. These services are exciting because they ultimately touch upon, 
indeed should be integrated with, a wide spectrum of information services, 
but they are also exacting in their implementation if their potential is to 
be realized. A NASIC that functions as a strong central association of
members could considerably ease this process with consultation, with training,
with back-up services, with collective voice to suppliers, and with
feedback to members.
II. DESCRIPTION OF PROJECT WORK BY TASK



Obtaining Service Data From External Suppliers (Task 1)

Data describing the characteristics, modes of access, and costs of
available external on-line and off-line bibliographic services was gathered
and reviewed. This data assisted the selection of retrieval
systems and data bases for initial NASIC services. Sufficient data for
both on-line and off-line modes is on hand to assist in the selection of
additional data bases as sources of NASIC services. However, the external
retrieval systems and data bases undergo continual modification and change,
so the data gathering effort also continues in order to keep abreast of
such changes.

Part of the data gathering effort has taken place in conjunction with
the NEBHE-ARL site survey study. M.I.T. has participated directly in site
surveys at the University of Georgia, Illinois Institute of Technology
Research Institute, University of California at Los Angeles, Ohio State
University, University of Florida, and the North Carolina Science and
Technology Research Center. M.I.T.'s role in these visits has concentrated
mainly on the current and planned services and operations of these centers.

A visit with the staff of the System Development Corporation yielded
new information and data on ORBIT. M.I.T. has received visits from
representatives of SDC, Lockheed, and the University of Georgia which have also
been beneficial in obtaining recent data.
Selection of Services (Task 2)

The ultimate long-range NASIC goal is to provide access to most
computer-based information services. Four areas of interest have been
considered: (1) data bases, or machine-readable bibliographic
or other surrogate records; (2) data banks, or machine-readable
numerical data, either raw or reduced; (3) text files other than surrogate
records that are computer-stored; (4) non-computer-stored (traditional)
information.

The first area, data base or bibliographic access, was a given for
NASIC services and it forms the core of NASIC activity for the foreseeable
future. NASIC will use the services of existing retrieval systems and
data bases rather than develop processing capabilities of its own. While
access to all such bibliographic services is desirable and necessary for
comprehensive coverage, an initial set of services had to be selected.
Other bibliographic retrieval services will be phased in over time. Time
to accomplish training of personnel is a critical element in the
introduction of all such services.

NASIC services at M.I.T. were initiated on-line with the SDC ORBIT
system and off-line with the University of Georgia system. The SDC ORBIT
system was selected for several reasons. It is somewhat easier to learn
to use the ORBIT retrieval system than most others. It is accessible
via the TYMNET network, making communications easier. It had, last August,
more data bases of interest to the M.I.T. community. It is essentially
the same retrieval system as the National Library of Medicine MEDLINE
system, access to which was being arranged concurrently but independently
by the M.I.T. Science Library. The University of Georgia system was
selected because it has the largest array of data bases in the country,
many of which are searchable in both retrospective and current awareness
modes.

Specific data bases available on those systems were also selected
for initial operations. The on-line data bases include the Chemical
Abstracts base (CHEMCON), plus data bases in education, linguistics, and
information science (ERIC) and in business, management, and economics (INFORM).
These data bases are of interest to large segments of the M.I.T.
community. Other data bases available at that time on ORBIT but not selected
were in medicine (MEDLINE) and agriculture (CAIN). There is a large
interdisciplinary interest at M.I.T. in the medical data base, but SDC's
service was not selected because, concurrently with NASIC services, MEDLINE
service is being made available (and at lower cost) by the M.I.T. Libraries
through arrangement with the National Library of Medicine. However, a
cooperative spirit and effort exists at M.I.T. in providing NASIC and MEDLINE
services to the ultimate benefit of both the user community and the growth
in usage of computer-based retrieval systems in general.

The off-line data bases initially selected for access from the
University of Georgia are the Chemical Abstracts Condensates and ERIC. This
selection complements or extends the data base time period coverage and
type of service provided by the on-line data base. Thus, a full range
of services is being provided for each data base whenever possible.
Retrospective searches are available both on-line and off-line. Current
awareness searches are available off-line, although ORBIT has plans to provide
such a capability. The relatively new INFORM data base is the only SDC
data base that does not have an off-line counterpart at Georgia.

Since the time of selection of initial data bases for NASIC,
SDC has made available data bases in engineering (COMPENDEX) and
geology (GEO-REF), and is expected soon to make available
government reports (NTIS) and a citations index (ISI). NTIS, CAIN,
COMPENDEX and GEO-REF are currently available off-line at Georgia. While
a detailed plan for phasing-in other data bases and retrieval systems is
in preparation, these particular data bases are of interest to the M.I.T.
community and there is a high probability of adding them next. A
contributing factor in the choice is that specialist training in the use of
these data bases can be accomplished sooner than would be the case if, in
addition, a new host retrieval system first had to be mastered. However,
the effort in testing additional retrieval systems for eventual use is
continuing and is now being aided by the use of real search problems.
Indeed, comprehensiveness of bibliographic data base coverage is one of
the most valuable features that NASIC can provide.
The NASIC effort is being concentrated on bibliographic data base
access. NASIC access to data banks of machine-readable numerical data,
either raw or reduced, such as, for example, the census, was considered.
However, numerical data bank access was deferred for the present in order
not to dilute the available resources necessary for successful training
in data base access and because information specialist personnel with
backgrounds additional to that of data base specialists, particularly in
programming, are required. However, M.I.T. and NEBHE studies of this area do
indicate the desirability of having effective interaction and referral between
data bank specialists and data base specialists.

Further consideration of the third area, machine-stored, full-document
text files, has also been deferred because of the very limited number of
sources at the present time.

It is particularly important to conclude this summary on selection
of services with a note that a conscious and continuing effort is being
made to develop an effective interface between the NASIC computer-based
information services and both traditional searching and document delivery
systems. While NASIC is primarily concerned with computer-stored
information, non-computer-stored information is of particular concern for at
least two important reasons, both of which are highly likely to influence
a user's opinion of the effectiveness of NASIC services. First, most,
if not all, machine-stored data currently available have limited
retrospective capabilities (on the order of a few years for most bibliographic
sources). Some number of users will have need to search further back in
time. Hence an effective interface needs to be designed between
computer-based information services and traditional searching. Second, both
traditional and machine-based bibliographic searches on document surrogates
usually generate a need to obtain access to the full documents. Hence,
an effective bridge between machine-based NASIC services and a document
delivery system (from holdings determinations through delivery) also is
highly desirable. The initial NASIC effort is giving full consideration
to the interface with traditional resources. For example, the reference
staff is being given sufficient information to be able to refer users to
NASIC; the reverse situation also arises because some users who come to
NASIC have information needs answered in part by, or only by, traditional
sources. In the next phase, we plan to experiment with document delivery
services as a means for following through and completing the retrieval
function.

The ultimate goal is, of course, an integrated set of information
services available to a user community at M.I.T. or at any other NASIC
site.
Library Staffing and the Selection of Information Specialists (Task 3)

An initial core of five Information Specialists was selected, one each
from the reference staffs of the five Divisional Libraries at M.I.T., to
provide NASIC services part-time. A sixth Specialist was subsequently
brought on board. The Information Specialists are:

Ms. Marge Chryssostomidis - Barker Engineering Library

Mrs. Pat Gordon - Science Library

Ms. Ann Longfellow - Rotch Library

Mrs. Jackie Stymfal - Dewey Library

Ms. Nancy Vaupel - Humanities Library

Ms. Susan Woodford - Science Library

The selection of Information Specialists reflects the general
organization of the M.I.T. Libraries, which are decentralized into five divisions
corresponding to each of the five schools (Engineering, Science, Architecture
and Planning, Management, Humanities and Social Science) at M.I.T. Each
Divisional Library could be expected to interface in some way with NASIC. To
handle the interface and to generate and maintain interest in NASIC
throughout the total Library system, the Library Administration selected a
part-time Information Specialist from each Divisional Library rather than one or
two Specialists for the total library system.

The Library Administration asked the Head of each Divisional Library
to recommend one of their staff members for the job. Although no formal
selection criteria were established, in point of fact the most important
informal selection criteria were previous experience with computer-based
services and/or a high level of interest or enthusiasm in such services.
Selection of Specialists was accomplished without any difficulties. Two
of the initial core of five Specialists had previous experience with
computer-based bibliographic services, in particular, Intrex, MEDLINE, and
several batch systems. Personal interest and enthusiasm from the other
Specialists did indeed rank high (one concurrently took a programming
course at M.I.T.).

The M.I.T. Libraries are providing, in parallel with NASIC services,
access to the National Library of Medicine MEDLINE system. The Libraries sent one
staff member from the Science Library to participate in the 3-week MEDLINE
training program. While NASIC and MEDLINE services are parallel services,
from a user's viewpoint this represents proliferation and is a potential
source of confusion for him. It seemed highly desirable to provide a single
point of access to any computer-based information source for the M.I.T. user
community. A coordinated effort can be and was implemented with greater
convenience to both user and staff while also providing appropriate
sponsorship credits and maintaining appropriate cost allocations for contractual
obligations. To this end, the initial core of five NASIC Specialists was enlarged
so that the MEDLINE trainee was also trained to provide
NASIC services. MEDLINE is a member of the ORBIT retrieval
system family and this fact has facilitated the coordination.

Although an initial core of NASIC Information Specialists had to be
selected, we would expect that, in time and with increased user demand, the
entire reference staff of the M.I.T. Libraries would be trained to provide
computer-based reference and retrieval services. Indeed, several members
of the reference staff have expressed such an interest.

Development, implementation, and the management of operational
activities within the Libraries require coordination, a fact recognized from
the outset in pre-contract discussions. The later decision to use a
decentralized staff of Specialists underscores the role of a Coordinator.
Ms. Mary Pensyl is the NASIC Coordinator and she has had previous experience
in the use and promotion of computer-based information services. The NASIC
Coordinator reports directly to the Library Administration. As we move into
Phase II, the NASIC Coordinator will be less concerned with development
and implementation and will instead devote more attention to publicity and
marketing.

Subsequent development of service operations at M.I.T. pointed up a distinct
need for support capability. An Assistant to the NASIC Coordinator, Mr. Phillip
Piper, was hired to carry out the duties of manning a telephone to receive
and answer inquiries, to schedule and carry out all arrangements for service
appointments, to maintain user files, to handle billing and accounting
procedures that flow through the Coordinator's Office, to distribute all
off-line printouts sent to this central office, to gather data on the service
operations and to assist in its reduction.




Access to External Services (Task 4)

Accounts have been obtained by NEBHE for use by M.I.T. to access the
SDC ORBIT system in an operational service environment. Other SDC accounts
obtained by M.I.T. are being used for training and experimental testing
purposes. Two accounts have been obtained by NEBHE for use by M.I.T. to
use the University of Georgia System; one account will be used for service
operations and the other for training and experimentation. Accounts have
also been obtained by M.I.T. to access the Lockheed DIALOG and the Battelle
BASIS-70 systems for testing and training purposes.
Training Program (Tasks 5 and 6)

A training program began in late August 1973 and continued through
15 November 1973 in order to achieve by the time of the initiation of
NASIC service operations a comfortable level of understanding and ease of
use of each of the initial NASIC retrieval systems and data bases by all
of the Information Specialists. Training was particularly intensive in the
few weeks immediately prior to initiating services. (At M.I.T., MEDLINE
is being coordinated with NASIC as a service activity, but MEDLINE was not
part of the NASIC training program. MEDLINE services are currently being
provided only by the two Specialists from the M.I.T. Science Library, both
of whom had previously received MEDLINE training at NLM. Their MEDLINE
service activities are in addition to their NASIC service activities.)
Since 15 November 1973, a small amount of continued training has occurred
to keep abreast of changes in the initial retrieval systems and data bases,
and to discuss and review the operational experiences of the Specialists.
Of course, additional retrieval systems and/or data bases that are to be
accessed as a NASIC service will require additional training effort
equivalent to comparable segments of the initial program.

The initial training program required approximately 130 man-hours
(3.7 man-weeks) of effort per Information Specialist. About 40 man-hours
effort per Specialist were devoted to the first module of the training
program. Bibliographic and information science concepts independent of
specific retrieval systems and data bases were covered and a general philosophy
of service was discussed. Major topics in Module A were:

— search problem elicitation 

— profile or search concept development 

— Boolean concepts 

— natural language and controlled language characteristics

— retrieval effectiveness concepts

— other search strategy techniques

— user search satisfaction criteria

The second and third modules together represent the bulk of the effort
and together required 90 man-hours of effort per Specialist. The second
module covered the specific protocols, commands and other characteristics
of the ORBIT and the University of Georgia retrieval systems, and the
specific characteristics of the Chemical Abstracts, ERIC, and INFORM data
bases as they are applied on those systems. The second module also included
hands-on experience at terminals for ORBIT (approximately 10 hours of
connect time per Specialist) and the development and running of profiles
for Georgia. Several real users participated in the training effort,
providing us with real search problems and the opportunity to conduct
reference interviews and to obtain feedback on terminology and search strategies.
The third and smallest module was concerned with the NASIC at M.I.T. pilot
service operational procedures and covered administrative, accounting,
service, feedback and analysis, and similar matters.
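
The training effort reported above can be tallied in a short sketch. The module hours and ORBIT connect time come from the text; the names are hypothetical, and the 35-hour work week is an assumption (the report does not state one) chosen because it makes 130 man-hours come to about 3.7 man-weeks.

```python
# Man-hours per Information Specialist, as reported for the initial program.
MODULE_HOURS = {
    "first module (general concepts)": 40,
    "second and third modules (systems, data bases, procedures)": 90,
}

# Approximate hands-on ORBIT connect time, part of the second module.
ORBIT_CONNECT_HOURS = 10

# Assumption: the report does not state the length of a work week.
ASSUMED_HOURS_PER_WEEK = 35


def hours_per_specialist(module_hours):
    """Total training man-hours for one Specialist."""
    return sum(module_hours.values())


def man_weeks(hours, hours_per_week=ASSUMED_HOURS_PER_WEEK):
    """Convert man-hours to man-weeks under the assumed work week."""
    return hours / hours_per_week


def program_hours(num_specialists, module_hours=MODULE_HOURS):
    """Total man-hours to train a given number of Specialists."""
    return num_specialists * hours_per_specialist(module_hours)


per_specialist = hours_per_specialist(MODULE_HOURS)
weeks = man_weeks(per_specialist)
print(per_specialist, round(weeks, 1))
```

A site planning its own program could call `program_hours` with its own head count to get a rough estimate of the staff time to budget, bearing in mind that the report expects the program to be recast for other sites.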

The training program was an amalgam of lectures and discussions,
individual study, practice on-line sessions in pairs and individually,
user interviews, and off-line profile development. Training materials
included assigned readings, system and data base descriptions and manuals,
system newsletters, and retrieval aids such as thesauri. NEBHE personnel
frequently attended the lecture and discussion portions of the program.
In addition, a day-and-a-half-long workshop was held at M.I.T. by Mrs.
Margaret Caughman of the University of Georgia to review their system and
the two data bases being accessed, to cover profile weighting, and to review
profiles submitted by M.I.T. for training purposes. Mr. Ran Hock of the
University of Pennsylvania also attended the workshop.

The topics covered during the initial training program are to be
reviewed during the next phase of NASIC and reassembled for use in a program
to train Information Specialists at other NASIC sites to a similar level of
preparation and understanding in the use of computer-based information
services. It is expected that the training program itself will be recast
to better meet the needs of geographically dispersed trainees. It is
anticipated that greater use will be made of guidelines, of "tutoring" by
experienced Specialists, and of apprentice search sessions. A lot of information
must be covered together with a lot of practice, but it is anticipated that
the training program can be so segmented as to allow time for a prospective
Information Specialist to absorb the material.
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Development of Operational Procedures at t^.I.T. (Task 6 ) 

An intensive effort to develop and set up operations for a pilot service
took place along several fronts. The efforts can be categorized as:
(a) library and other sites for service; (b) terminals for on-line services;
(c) service modes; (d) service charges; (e) accounting and billing
mechanisms; (f) data gathering for management information and statistical
purposes. These efforts on initial development have essentially been concluded.
Details are given on the following pages. Modifications to the initial
procedures will be made on a continuing basis as further experience, feedback,
and analysis may dictate.

Library and Other Service Sites (Task 6A)

NASIC services are being offered through each of the five divisional
libraries at M.I.T. Each library except the Science Library was surveyed
together with the Head of that library in order to select particular
locations within the library suitable as a NASIC service site. Criteria
used to select a site included: visibility of on-going NASIC services to
other library users; sufficient physical space for both users and for small
group demonstrations; electrical and telephone installations; noise level,
general environment, and traffic flow; ease of access to the reference
collections; ease of access to quarters for secure storage of terminals.
Some tradeoffs had to be made. In some instances, particularly where
large expenditures for physical improvements might have been
required for an otherwise desirable location, temporary sites are being
used instead pending review based on operational use. The site being used
in the Science Library for NASIC services had been previously selected for
MEDLINE service and it meets all of the criteria established for a NASIC
site. That site has been strikingly and attractively decorated by the M.I.T.
Libraries and it serves as a model for the other libraries. Sites outside
the Libraries at which NASIC services are to be offered upon user request
and on a test basis include the on-campus offices and laboratories of the
user community. The object is to increase further the convenience of access
to NASIC service for users.

Terminals for On-Line Services (Task 6B)

Texas Instruments Silent 700 thermal terminals with upper/lower case
capability were selected for initial NASIC services. They are quiet,
reliable, portable, and operate at 30 characters per second output, all
characteristics that make them highly suitable for operation within a library
environment. Two such terminals have been leased by the Project and,
together with a third such terminal leased by the M.I.T. Libraries, are
being shared among the five Divisional Libraries for NASIC services. A
fourth terminal leased by the Electronic Systems Laboratory is being made
available to NASIC for back-up purposes. While these are portable terminals,
they do weigh in at 30 pounds. Terminal logistics have been a bit of a
nuisance. On occasion, the Information Specialist and user have travelled
to the terminal rather than the reverse. Until user demand justifies
additional terminals such that there will be at least one per library, we may,
in the next Phase, reduce the number of locations at which NASIC services
are available. In addition, Specialists seeking online practice training
with additional data bases have occasionally been hampered because the
shared terminals were not close by.

Service Modes (Task 6C)

In preparing for NASIC services, we had
to work with two unknown quantities, namely, initial user demand and its
rate of growth. Therefore, considerable attention was given to determining
the best modes of pilot service operation that would simultaneously permit
convenience of service to users; controlled growth of service operations
to meet demand when and where it arises; control over services for
monitoring and analysis purposes; flexibility for easily modifying service
procedures; flexibility in working with time slots when only certain on-line
data bases are available; flexibility for both NASIC and for the
Information Specialists in interfacing with traditional library activities; and
flexibility in adding new operational services.

An appointment basis for service was selected over on-demand methods
as it provides the best accommodation of both flexibility and control,
particularly with limited personnel resources. In addition, the appointment
places NASIC services on a more professional footing because the user is
assured of undivided attention and service at the appointed hour, not a
small matter when he is paying for service.

Each of the Information Specialists has been trained to work with each
of the initial NASIC retrieval systems and data bases (MEDLINE excepted).
A user may obtain service at any Divisional Library or, upon request, at
his office or laboratory. A logistics schedule for appointments was drawn
up based upon personnel and the time of day that online data bases are
available. A Specialist, although attached to one library, may be asked
to meet an appointment elsewhere. Appointments are scheduled centrally
through the NASIC Coordination Office.

The initial appointments schedule (see Figure B1) theoretically is able to
accommodate a total of 75 hours of service, 54 hours on-line and 21 hours
off-line. The schedule allows the user some flexibility in arranging an appointment.
The total is twice the number of hours we anticipated as being actually
necessary to service a demand rate anticipated to grow to 25 users per week
after the first few months of operation. Each Specialist has been scheduled
to cover at some point each of the data bases and systems (on-line and
off-line, and retrospective searches as well as current awareness profiles),
although some weighting has been done to provide more hours of service
for the chemistry data base by the Specialists from the Engineering and
Science Libraries. MEDLINE services have been integrated into the
appointments schedule but such service is available only from the Science Library
Specialists trained at NLM. All appointments are nominally scheduled for
one hour but the schedule allows an additional half-hour to catch latecomers
and run-over sessions, as well as to allow some time for the Specialist
to complete a write-up about the appointment for later monitoring and
analysis. Appointments are scheduled for the time slot and location most
convenient to the user. With a moderate demand level and current staff
commitments, we anticipated no more than a two or three day wait. The
appointment mode has proven to be useful in practice, but detailed
discussion of the match between the plans on which service appointments are
based and our real operational experience appears below in Task 12,
Monitoring and Analysis of Service Operations.
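
The capacity figures quoted above can be checked with a few lines of arithmetic. The following is only a minimal sketch, not part of the NASIC procedures; it assumes that every appointment occupies its nominal hour plus the full half-hour allowance, which is one reading of the schedule under which the stated factor of two comes out exactly.

```python
# Figures quoted in the text above.
ONLINE_HOURS = 54
OFFLINE_HOURS = 21
EXPECTED_USERS_PER_WEEK = 25

# Assumption: each appointment occupies its nominal hour plus the
# half-hour allowance for latecomers, run-over and write-up.
SLOT_HOURS = 1.0 + 0.5


def weekly_capacity_hours() -> int:
    """Total hours of service offered by the initial schedule."""
    return ONLINE_HOURS + OFFLINE_HOURS


def weekly_demand_hours(users: int = EXPECTED_USERS_PER_WEEK) -> float:
    """Scheduled hours consumed if every user takes one full slot."""
    return users * SLOT_HOURS


if __name__ == "__main__":
    capacity = weekly_capacity_hours()
    demand = weekly_demand_hours()
    print(f"capacity {capacity} h, demand {demand} h, "
          f"ratio {capacity / demand:.1f}")
```

Under that assumption, 25 users per week would occupy 37.5 scheduled hours, half of the 75 hours offered.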

In Phase 2, we want to experiment with other modes of service. These
include delegated searches, SDI online services, on-demand searches without
an appointment, user self-searches, service in an office or laboratory,
and document delivery services. Document location and delivery services
would provide an important follow-through to any of the search modes since
the search result only provides references to documents. Document delivery
also promotes integration of computer-based searches with more traditional
library activities.

Service Charges (Task 6D)

A pricing structure for NASIC services at M.I.T. was developed.
The structure is a compromise between the complexities associated with
accurately predicting all costs when we have no historical cost data to
go by, and a too simple flat rate that would reflect real costs inaccurately.
The pricing structure has four main components whose sum is the total cost
to the user. The components are (1) a direct computer search cost to
which a surcharge is added for recovery of administrative costs, (2) a
direct charge for the time of the Information Specialist, (3) a direct
charge for the cost of off-line printouts, (4) charges for special
services.

(1) Computer search costs are established as rates based upon either
the number of hours of terminal connect time for on-line services, or on
the number of years of data base coverage searched for off-line services.
The rates are retrieval system and data base dependent. Table 1, a draft
price list, shows the rates in effect from 15 November 1973 to 31 January
1974. Table 2 gives the rates in effect for February 1974. The change
in the rates for some data bases reflects changes in the supplier's rates.
The computer search cost rate includes (a) the actual rate charged by
the external service less any discount provided to NASIC; (b) a rate
surcharge added by NASIC at NEBHE to support the regional organization,
this surcharge initially equal to the discount received from the external
service; (c) a rate surcharge added by M.I.T. to defray its expenses in
providing central personnel, telephones, terminals, and materials necessary
for service. The M.I.T. administrative surcharge included in the search
cost rate is currently set at $12 per connect hour or $6 per profile-year.
The M.I.T. administrative surcharge of $12 per connect hour was
derived by assuming achievement within the first year of an operational
level of 125 users per month, with the average user requiring an hour of
Specialist time and 30 minutes of terminal connect time. It was also
assumed that this volume would require a Coordinator at 20% full-time,
and an Assistant to the Coordinator at 60% full-time. Monthly costs were
estimated for salary and benefits for the Coordinator and the Assistant,

Table 1

INITIAL RATES EFFECTIVE 15 NOVEMBER 1973 - 31 JANUARY 1974

M.I.T./NASIC PRICE LIST (draft 11/29)

The following prices for services provided at M.I.T. for NASIC
and other information services will be in effect until February 1974.

A. SPECIFIC PRICES

1. Information Specialist: $8/hr. (minimum charge $5)
   [This charge is currently being credited; see (B) below]

2. Offline Printout: $0.10/page
   (output onto cards: $0.05 per card)

3. Special Services: (prices not yet worked out)

4. Computer Search: (see table below; charges in dollars)
   [minimum charge for computer search: $5]

                            Type of Computer Search

   Data Base         On-line***          Current Awareness***     Offline Retrospective****
                     (per connect-hour   (annual subscription)    (per year of data base
                     at terminal)                                 searched)

   CA-Condensates*   $55                 $370                     $166
   ERIC*             $44                 $ 86                     $ 76
   INFORM            $67                 OL                       OL
   MEDLINE**         $18                 OL                       OL

   OL = service available only online

   *    The chemical and education data bases are each divided into two parts for
        offline searching. If your problem can be handled by only one part, pricing
        can be cut in half. See brochures on these data bases for details.

   **   The MEDLINE (Medical On Line) service is being provided by M.I.T. in
        cooperation with the National Library of Medicine which subsidizes the
        major portion of the costs. One third of the MEDLINE computer search charge
        ($6/hr.) goes to NLM, the remainder goes to M.I.T. NASIC does not
        participate in MEDLINE service and collects no payment for it.

   ***  Charge prorated if less than hour or year of service.

   **** Offline retrospective searching for the current year is charged on the
        basis of the current awareness rate, pro rata for that portion of any
        incomplete volumes to be searched.

B. INTRODUCTORY CREDIT OFFER

As an introductory offer M.I.T. is crediting each new user's
account with a total credit of $50 which can be used to defray the
charges for the information specialist. This credit is limited to
M.I.T. users and must be used before the end of this academic year
in June, 1974.

Table 2

SERVICE RATES EFFECTIVE FOR FEBRUARY 1974

NASIC AT MIT
Automated Bibliographic Services for Research

COST OF SERVICES

Effective February 1, 1974, costs for NASIC search services
are as listed below.

Information Specialist: $8/hr (minimum $5). This charge is
currently being credited for MIT faculty, students and
staff; see (A) on reverse side.

Computer Search: As shown in the table below, there are two
rates, one for educational and government users (EDUC) and
one for commercial users (COML). In either case the
minimum charge is $5.

Off-line Printouts associated with the computer search
may involve additional charge as shown in the table.

ON-LINE SEARCH (per terminal connect hour)

   Data Base       EDUC     COML     Off-line Printout

   CA-Cond.        $ 67     $ 82     $.08/cit.
   ERIC            $ 47     $ 62     $.08/cit.
   INFORM          $ 67     $ 82     $.10/cit.
   MEDLINE (B)     $ 18     $ 18     $.10/page

OFF-LINE RETROSPECTIVE (per year searched)

   Data Base            EDUC     COML

   CA-Cond.
     Odd or even        $ 88     $ 98
     Odd and even       $166     $176
   ERIC
     RIE or CIJE        $ 41     (not legible)
     RIE and CIJE       $ 76     (not legible)

   Printout
     citation only:        paper - free;       card - $.02/cit.
     abstract & citation:  paper - $.10/cit.;  card - $.12/cit.

CURRENT AWARENESS (annual subscription)

   Data Base            EDUC     COML

   CA-Cond.
     Odd or even        $190     $200
     Odd and even       $370     (not legible)
   ERIC
     RIE or CIJE        $ 46     (not legible)
     RIE and CIJE       $ 86     (not legible)

   Printout
     citation only:        paper - free;       card - $.02/cit.
     abstract & citation:  paper - $.10/cit.;  card - $.12/cit.


For further information see the detailed NASIC at MIT
brochures or contact the NASIC Coordinator's Office:

Room 10-400

SAMPLE COST CALCULATION

Assume an INFORM search by an academic user from MIT
takes 45 minutes of the time of an Information
Specialist (3/4 x $8 = $6.00), 30 minutes of terminal
connect time (1/2 x $67 = $33.50) and results in 42
citations being printed off-line (42 x $.10 = $4.20).
Total cost is $43.70, of which the $6.00 for the
Information Specialist's time will be credited,
leaving a net cost of $37.70.

A commercial user's cost for the same service would be
$6.00 + (1/2 x $82) + $4.20 = $51.20.

EXPLANATORY NOTES:

(A) During the 1973-74 academic year, each account of
    a user from MIT will be credited with $50
    applicable to charges for the time of the
    information specialist.

(B) The MEDLINE (Medical On Line) service is being
    provided by MIT in cooperation with the National
    Library of Medicine which subsidizes the major
    portion of the costs. One third of the MEDLINE
    computer search charge ($6/hr) goes to NLM, the
    remainder goes to MIT. NASIC does not participate
    in MEDLINE service and collects no payment for it.

(C) The charge for Current Awareness is prorated for
    subscriptions of less than one year.

For further information, see the detailed NASIC at MIT
brochures or contact the Coordinator's Office.

Northeast Academic Science
Information Center

A Program of the New England Board of Higher Education,
NASIC is supported by the National Science Foundation
under Grant No. GN-37296.

2/74

for terminal rental, for terminal paper, for telephone and message units,
and for miscellaneous supplies. These costs were prorated among the
assumed monthly user volume and translated into a connect-hour charge on
the basis of the assumed half-hour average connect time per user. The
M.I.T. administrative charge of $6 per profile year for offline searches
was derived in the same way except that terminal cost and terminal supplies
were excluded, and it was assumed that the average user would obtain a
one-year current awareness profile or a one-year retrospective search.
Beginning 1 February, NASIC services were extended to users affiliated
with industrial or commercial organizations. At the request of NEBHE,
a surcharge to this group of users was instituted of $15 per connect hour
for online services or $10 per profile year for offline services. The
industrial user surcharge is passed completely to NEBHE to help defray
the real costs in developing NASIC. The National Science Foundation
grant is paying the development costs for the academic community. There
is no industrial surcharge on MEDLINE service because it is not a part
of NASIC. The total charge to a user for the computer search cost
component, including all applicable surcharges, is prorated for online searches
under an hour, or for offline searches for less than a year, but there
is a minimum charge of $5.
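
The way the search rate is assembled, and the way the connect-hour surcharge follows from the volume assumptions, can be sketched as below. This is an illustration only: the function names are ours, the supplier rate and discount used in the usage note are hypothetical, and the report does not state the estimated monthly cost, so the figure computed here is merely what the stated assumptions imply.

```python
def search_rate(external_rate, discount, mit_surcharge, industrial_surcharge=0):
    """Compose a computer search rate from the components described above.

    The NEBHE surcharge is initially set equal to the supplier discount,
    so the two cancel and the user pays the supplier's list rate plus the
    M.I.T. surcharge (and the industrial surcharge, where it applies).
    """
    nebhe_surcharge = discount
    return (external_rate - discount + nebhe_surcharge
            + mit_surcharge + industrial_surcharge)


def implied_monthly_cost(users_per_month, connect_hours_per_user,
                         surcharge_per_hour):
    """Monthly administrative cost that a connect-hour surcharge recovers."""
    return users_per_month * connect_hours_per_user * surcharge_per_hour


if __name__ == "__main__":
    # 125 users per month at half an hour each is 62.5 connect hours.
    print(implied_monthly_cost(125, 0.5, 12))
    # Hypothetical supplier list rate of $55 with a $5 discount.
    print(search_rate(55, 5, 12), search_rate(55, 5, 12, industrial_surcharge=15))
```

With 125 users per month and a half-hour average connect time, a $12 surcharge recovers $750 per month. With the hypothetical $55 list rate, the composed rates are $67 for academic users and $82 for industrial users.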

(2) Information Specialist time is charged at the rate of $8 per
hour during an appointment, with a minimum charge of $5. No charge is made
when the Specialist and user together decide that a NASIC service would
be inappropriate for the user's problem, or when a potential user seeks
general information about NASIC services. The hourly rate for the
Information Specialist was derived assuming a one-hour user appointment
and an additional 10 minutes preparing for user sessions or for
post-session clean-up. The rate is based on salary plus employee benefits.

(3) Off-line printout charges for the period 15 November 1973
through 31 January 1974 are shown in Table 1 and for February 1974 in Table 2.
The initial rates were all on a per-page charge and, for simplification,
had been set at a uniform cost for all data bases. Beginning 1 February 1974
new rates were set to reflect a significant change in the actual charge basis
used by SDC. Their rates are now on a per-citation basis. At the
time the SDC change was impending, we thoroughly reviewed the basis of
the initial NASIC printout charge in terms of cost recovery. We concluded
that the printout component of the NASIC price structure for a given data
base should be in concert with the structure of the data base supplier.
Thus, the printout component is now retrieval system and data base
dependent and is similar to the computer search component except that the
printout component contains no surcharge for administrative costs.
Recovery of overhead associated with printouts is to be included later with
other indirect costs in the connect-hour surcharge.

(4) Special services will be charged for in addition to the above costs.
Such services may include, for example, document reproduction and delivery.
No special services have been developed during the Phase 1 period nor have
costs and prices for anticipated services been investigated.

In order to aid in the introduction of NASIC services to the M.I.T.
community, the M.I.T. Libraries are forgiving users the charge for the
Information Specialist time up to a maximum of $50. This is the
equivalent of about six hours of appointments. The credit is limited to M.I.T.
users and must be used before the end of the academic year in June 1974.
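
The four-part price structure and the introductory credit can be combined into a single calculation. The sketch below is our own illustration, not a NASIC procedure. It works in whole cents, covers only an on-line appointment with per-citation printout, and assumes that prorated amounts are rounded to the nearest cent and that the minimum charges apply to each component separately; the report does not spell out either point.

```python
SPECIALIST_RATE = 800      # cents per hour
SPECIALIST_MINIMUM = 500   # cents
SEARCH_MINIMUM = 500       # cents
MIT_CREDIT_LIMIT = 5000    # cents, introductory credit for M.I.T. users


def prorate(rate_cents_per_hour: int, minutes: int) -> int:
    """Charge for part of an hour, rounded to the nearest cent (assumed)."""
    return round(rate_cents_per_hour * minutes / 60)


def session_charge(specialist_minutes, connect_minutes, search_rate_cents,
                   citations, printout_cents_per_citation, mit_user=True,
                   credit_remaining=MIT_CREDIT_LIMIT):
    """Return (total, net) in cents for one on-line appointment."""
    specialist = max(prorate(SPECIALIST_RATE, specialist_minutes),
                     SPECIALIST_MINIMUM)
    search = max(prorate(search_rate_cents, connect_minutes), SEARCH_MINIMUM)
    printout = citations * printout_cents_per_citation
    total = specialist + search + printout
    # The credit applies only to the Information Specialist charge.
    credit = min(specialist, credit_remaining) if mit_user else 0
    return total, total - credit


if __name__ == "__main__":
    print(session_charge(45, 30, 6700, 42, 10))                  # academic
    print(session_charge(45, 30, 8200, 42, 10, mit_user=False))  # commercial
```

Run against the sample calculation printed on the February price sheet (45 minutes of Specialist time, 30 minutes of INFORM connect time, 42 citations), the sketch gives a total of $43.70 and a net cost of $37.70 for an M.I.T. user, and $51.20 for a commercial user.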

In accordance with a new
National Library of Medicine policy to allow all MEDLINE users to obtain
a standard mode of service at a standard price, we have announced the
availability of a "standard" MEDLINE search at a fixed fee of $7.50. The
standard search as defined by the National Library of Medicine is one
requiring less than a half-hour of Specialist time, less than 10 minutes connect
time, and less than five pages of printout. Users at M.I.T. who desire a
more extensive MEDLINE search will be charged at regular rates.

Cost recovery is an essential and fundamental part of the M.I.T.
pilot operation. These services must be paid for. While some NASIC host
institutions may choose to absorb part or all of the costs and not charge
users directly, M.I.T. has chosen to attempt direct recovery of costs.
This policy has been adopted in part because the only sure way to test
user receptivity to direct charges is to actually charge appropriate fees.
We expect that many M.I.T. users will use contract or grant monies to pay
for NASIC services, and operational use bears this out. More significantly,
other users have paid out-of-pocket for services. However, there are
potential users, notably undergraduate students, who generally will not have
access to research funds. We are currently exploring mechanisms for
providing funds to cover the cost of service to these people. One possible
model, for example, of sources of funds for student use of NASIC goes along
lines similar to dollar support in many places of student use of
computational facilities.

A discussion comparing the assumptions underlying the user charges
for service with our actual experiences appears later in Task 12,
Monitoring and Analysis of Service Operations.

Billing and Accounting Mechanisms (Task 6E)

Another large effort was invested in the development of a mechanism
to handle the billing and accounting activities of the initial NASIC pilot
service. The existing services and facilities of the M.I.T. Accounting
Office are being used as much as possible. The Accounting Office is
handling the financial transactions associated with NASIC service and
interfaces, for billing purposes, between NASIC and the user. Accounts
representing each library site plus the NASIC Coordinator's Office have
been set up through the M.I.T. Libraries and, for each account, reports
by object class for different categories of income are furnished to the
NASIC Coordinator by the Accounting Office. However, while M.I.T. retains
control over the financial interface between the local NASIC service and
the charges to and payments from its users, NEBHE retains control of the
financial interface between NASIC services at M.I.T. and outside search
services. All bills from outside agencies for services purchased by NASIC
at M.I.T. are sent to NEBHE for payment. NEBHE, in turn, bills M.I.T.
through the Coordinator's Office for the cost of such services plus a
surcharge equal to any discount received by NEBHE from the supplier. NEBHE
is apprised of external services purchased by users of NASIC at M.I.T. by
copies of orders and/or reports sent periodically to NEBHE by the
NASIC Coordinator's Office. These reports include industrial usage of
services; the industrial surcharge collected by M.I.T. is passed along
to NEBHE.

This initial phase mechanism could serve as a model for NEBHE in
setting up service sites at other institutions, whereby the local
institutional accounting office handles the billing from those NASIC sites to
users at those institutions. NEBHE would act as a collection agency only
when the local institution could not.

The initial phase operation is currently being tested in practice,
and the general flow of materials and information is described below. The
initial billing and accounting operation functions around a requisition
and a work order.

1. At M.I.T., users may pay for NASIC services in several ways:

a) by authorized requisition against an internal M.I.T. account

b) by purchase order from an external organization

c) by cash receipt showing a deposit made to a NASIC account
with the M.I.T. Bursar

d) by personal check

e) for M.I.T. users only, personal billing.

In all cases, the user is given an estimated cost of service, based
upon an hour appointment, at the time he arranges his appointment.

2. At the appointment, the user presents his requisition, purchase
order, or cash receipt to the Specialist, if payment is by one of
those modes. Then, the user and Information Specialist together
complete a work order based on the user's information problem. At
the completion of service, the work order contains the cost to the
user for services rendered and the user receives a copy for his own
records. The user is also told of the nature and rate of any other
costs, such as for off-line output, that are to be billed to him later.
The work order indicates whether industrial rates apply. (See Figure B8.)

3. The original work order is forwarded together with any requisition
or purchase order to the NASIC Coordinator's Office. The Assistant
to the Coordinator prepares from it either a clean order for an
off-line external service, or a summary report of any on-line services
already rendered. A clean order is sent to the external agency
providing the offline service. Copies of a clean order and/or a summary
report are sent to NEBHE as notification of services purchased
externally by M.I.T. on accounts maintained by NEBHE with those agencies
for M.I.T. use.

4. When NEBHE receives an invoice from an external agency, NEBHE
makes all payments to the agency and forwards the bill for those
services together with any NEBHE surcharge to the M.I.T. NASIC
Coordinator's Office. The Coordinator's Office verifies and forwards the
bill to the M.I.T. Accounting Office for payment to NEBHE.

5. In the meantime, a copy of the user's work order, with object
classes entered against services by the Coordinator's Office, is
forwarded to the M.I.T. Accounting Office. All supplementary bills,
such as for off-line output, are prepared by the Coordinator's Office
and forwarded to the Accounting Office. The Accounting Office charges
the user's account and handles all collections, account transfers, and
refunds. The Accounting Office furnishes reports to the M.I.T. Libraries
and its NASIC Coordinator's Office.

For MEDLINE services, billing and accounting function as described
above except that a) NEBHE is not in the flow and b) bills from the
National Library of Medicine are received directly by M.I.T.

As an aid in the computation of charges, Specialists have rate sheets
showing the cost by minute for on-line connection to each data base, and
for Information Specialist time, and the cost by page or citation for
off-line printout. A rate sheet for CA Condensates is shown in Figure B12.
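
A cost-by-minute rate sheet of this kind can be generated directly from an hourly rate. The sketch below is a hypothetical reconstruction for illustration; it is not taken from Figure B12, and it assumes that each entry is simply the hourly rate prorated by the minute and rounded to the nearest cent, with any minimum charge applied separately.

```python
def rate_sheet(rate_dollars_per_hour: int,
               max_minutes: int = 60) -> dict[int, float]:
    """Map minutes of use to the prorated charge in dollars."""
    cents_per_hour = rate_dollars_per_hour * 100
    return {
        minutes: round(cents_per_hour * minutes / 60) / 100
        for minutes in range(1, max_minutes + 1)
    }


if __name__ == "__main__":
    # February 1974 educational rate for CA Condensates: $67 per connect hour.
    sheet = rate_sheet(67)
    for minutes in (1, 15, 30, 45, 60):
        print(f"{minutes:2d} min  ${sheet[minutes]:.2f}")
```

The same function serves for Information Specialist time by passing the $8 hourly rate.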
Data Gathering Procedures for Management Information, Qualitative
Evaluation, and Statistical Summaries (Task 6F)

The initial operational procedures described in this report have
been set up with monitoring and evaluation of services in mind. Data
is being gathered in several ways for input to an analytic effort (see
Task 12 below). A number of forms have been prepared to capture data
during or immediately after an event. Several of the forms are included
in Appendix B of this report. Development of the initial data gathering
procedures is near completion.

Because of the centralized appointments mode, NASIC publicity
names the Coordinator's Office as the place to contact for all inquiries.
An inquiry data sheet has been prepared to capture, for example, data
about the caller, whether by phone, by mail, or in person, his location
and status, the nature of the inquiry, and the responses given. If an
appointment is made, then date, time, and place are noted along with
anticipated services, payment method, and problem title. The inquirer
is always asked about how he learned of NASIC services. If no
appointment is set up, an attempt is made to ascertain the reason. If the
inquirer is interested in a data base or service not yet available, his
name, location and interests are entered into a special file so that
we know what these are and we can personally notify him when such
services, or closely related ones, are made available. In addition,
some of these users can aid us in experimentally testing new services
under consideration. (See Figures B2, B3, and B4.)

Many inquiries will not come initially to the Coordinator's Office.
Many will first be received elsewhere within the library system. The
entire library staff is being asked to refer inquiries to the Coordinator's
Office or to the Information Specialist in that library. Printed post
cards (Figure B5) have been distributed to help accomplish this referral during
non-business hours. Most importantly, an orientation program for the library
staff is nearing completion so that the staff (a) may be in a more
knowledgeable position about NASIC (particularly the reference staff),
(b) can identify with NASIC as a major library activity, and (c) can
help NASIC reach a wider audience.

A user who makes an appointment is sent a reminder (see Figure B6) as to
time and place, and, if it applies, to bring a requisition or purchase order.
The user also receives a problem statement form and is encouraged to complete
it and return it before his appointment. (See Figure B7.)

The problem statement form is modelled after those typically used by
off-line search centers. The objective is to capture a natural language
description of the problem and to define its boundaries. This statement
is exceedingly useful if properly completed because the user has then
thought about his problem beforehand. If it is returned in advance of the
appointment, the Specialist will also have reviewed the problem. The
user problem statement serves both the Specialist and the user as a common
ground, a point of departure for further probing in working out search
terminology and search strategies. The narrative portion of the user
problem statement is the most important part. It is essential that the user
not attempt to structure his problem at this stage by guess or
preconception in anticipation of how material relevant to his problem is indexed.
The narrative is more meaningful than lists of phrases or words
because it provides additional context by interrelating the phrases. It
provides essential details typically absent in search titles. The
remaining portions of the user problem statement are useful for the Specialist
in at least two ways: (1) they provide additional handles useful in
interpreting search feedback, and (2) they can aid in increasing search
precision (more limited results) should that be necessary. The user
problem statement is also useful in later analysis of the search session and
its effectiveness.

The work order captures the essentials of the services rendered to
the user. It contains the specific time length, charge data, and method
of payment for a user session. The charges to the user are classed by
object code for accounting reports. This data is essential to determining
the cost recovery effectiveness of the pricing algorithm. (See Figure B8.)

An appointment session log (Figure B9) allows the Specialist to
capture in free form important decisions and problems that arose during the
session. Capture of details of technical problems serves as a basis for
securing credits from external agencies as well as providing them with
feedback on the quality of the services they are selling.

Off-line outputs are sent to the Coordinator's Office so that (a) user
charges can be calculated and billed if these charges were not previously
included on the user's work order, (b) identification numbers appearing
on the printout can be deleted as a precaution against unauthorized use
by others of these numbers, (c) a legend for interpreting coded data on
the printout can be enclosed, (d) a thank you for using NASIC can be
attached, and (e) an evaluation form covering services and output can be
enclosed for subsequent return by the user. As of the date of this report,
the legend and the evaluation form have not yet been completed. When full
document delivery service is established, a mechanism integrating it with
search output will be developed.

The Information Specialists are keeping detailed time breakdowns
of their NASIC activities. This information is essential not only for
contractual commitments, but also for understanding the nature of an
Information Specialist's job, and for determining costs of activities.
(See Figure B10.)

The Information Specialists also keep a terminal log for all log-on
connections and printout requests, whether for training, for experimentation,
or for operational services. This data is used in verifying the
external agency invoices, and for determining costs of activities beyond
that of a user search session proper. (See Figure B11.)

The data that is gathered must be reduced and analysed. Our initial 
work on analysis is presented later in Task 12, Monitoring and Analysis 
of Service Operations.

Marketing and Publicity (Tasks 7 and 10)

Plans for marketing and the development of promotional materials 
constituted a major effort and it was carried forward in cooperation with 
the NEBHE Staff. 

Early on, the characteristics of the major channels of publicity at
M.I.T. were identified. In addition, statistics were gathered about the
M.I.T. population by department and laboratory and by status, as, for
example, faculty, students, staff, visitors. Information already compiled
at the Institute in the form of directories to the major research
interests of the M.I.T. community was also collected. The population
and research interest data has been used in the selection of data bases
and in the determination of selective mailing lists by department. The
publicity channels primarily used to date center around news releases,
brochures, mailings to selective identifiable groups, meetings, and
demonstrations.

Several brochures specific to NASIC at M.I.T. were prepared and added
to an earlier brochure produced by NEBHE describing the general objectives
of NASIC. One of the new brochures for M.I.T. gives an overall picture
of the initial NASIC services available at the Institute. This general
brochure is complemented by three others, each containing important general
information but each also containing content specific to a different
data base: one in chemistry, one in education, and one in business. A
fourth brochure, prepared and produced at M.I.T. expense, covers medicine.
A price list complements all the brochures. The price list is illustrated
in Table 2. The brochures are illustrated in Appendix C.

The publicity events included, in chronological order:

1. An initial news release to Tech Talk, the M.I.T. Community weekly newspaper,
appearing 12 September 1973 and describing the NASIC subcontract
effort at M.I.T. This release coincided with the beginning of the Fall
semester. The release was issued to the Associated Press and was picked
up by at least the following national and local newspapers: The New
York Times (23 September 1973), the Worcester, Mass. Gazette (20 September
1973), and the Fall River, Mass. Gazette (20 September 1973).

2. At the invitation of Miss Nicholson, Professor Reintjes spoke to the
M.I.T. Library Council on 17 October 1973.

3. Miss Nicholson spoke about NASIC before the M.I.T. Faculty Council
meeting on 7 November 1973. This was an agenda item for their
monthly meeting.

4. A letter prepared by Miss Nicholson announcing the opening of NASIC
services on 15 November was sent together with a copy of the General
Brochure to all 2600 M.I.T. faculty and staff members on 12 November
1973.

5. The campus student newspaper The Tech interviewed Miss Nicholson
and Professor Reintjes about NASIC services and ran their article
on 27 November 1973.

6. The letter by Miss Nicholson to faculty and staff and the General
Brochure were sent to all 1388 M.I.T. research assistants and teaching
assistants on 29 November 1973. One or more brochures on specific
data bases were included in the mailings to all 374 graduate
students in selected departments. The chemistry brochure was sent
to those students in the Chemistry, Chemical Engineering, Biology,
Metallurgy and Material Science, and Nutrition and Food Science
Departments. The medicine brochure was also sent to students in
some of those departments. The education brochure was sent to
students in Foreign Literature and Linguistics. The business
brochure was sent to students in Economics.

7. An article about NASIC services appeared in Tech Talk on 5 December
1973.

8. William Duggan and Alan Benenfeld spoke before the M.I.T. Administrative
Officers regular monthly meeting on 20 December 1973. The billing
and accounting interface with NASIC services was emphasized.

9. A letter sent by Miss Nicholson to each library staff member about the
opening of NASIC services was a prelude to a series of orientation
meetings held beginning in December by the NASIC Coordinator and the
Information Specialists with the staff of each library. The meetings
were to familiarize the staff with NASIC services and to further encourage
development of effective communications and coordination between
traditional library services and computer-based library services.
Considerable emphasis is placed on NASIC as an integral part of library
services.

10. A Chemical and Engineering News reporter was referred to the NASIC
Coordinator by the Tech Talk Office. A brief announcement appeared
in the 24 December 1973 issue of Chemical and Engineering News. Some
inquiries by non-M.I.T. people were received as a result of that
announcement.

11. A series of six online demonstrations and one seminar plus demonstration
about NASIC services were held in January during M.I.T.'s Independent
Activities Period (IAP) by Mary Pensyl and the Information
Specialists. There were two demonstrations of the education data base,
one of chemistry, two of business including one at the seminar, and
two of the medical data base. The demonstrations consisted of 10
minute introductory talks, a 20 minute prepared search to illustrate
features of the data base, and a 10 minute question and answer period.
The seminar consisted of a 30 minute discussion followed by a 20 minute
prepared search and a 10 minute question and answer period. Approximately
120 people attended the demonstrations and about 20 people
attended the seminar. These activities drew a mixture of M.I.T. faculty,
students, and staff, in addition to a number of non-M.I.T. personnel,
particularly librarians from other schools in the area. We know of
three appointments that resulted from this effort. One M.I.T. department
began to investigate the possibility of setting up funds for its
students to draw upon for NASIC services.

12. A two-day Symposium was held on 7-8 February 1974 at M.I.T. and jointly
sponsored by the M.I.T. Electronic Systems Laboratory, the M.I.T.
Libraries, and the New England Board of Higher Education. The symposium
covered two topics: (1) reports on the results of Project Intrex, and
(2) reports on the status of the Northeast Academic Science Information
Center. Invitations were extended to the Deans of the fourteen graduate
library schools in the northeast and to the Heads of 43 university, college,
and other large research libraries, also all in the northeast. Each
institution was invited to send two representatives. Sixty persons
attended. Natalie N. Nicholson launched the NASIC portion of the program
with an "Introduction to NASIC at M.I.T." on Thursday evening.
Professor Reintjes introduced the Friday program which included presentations
on "The Background and Objectives of NASIC" by David Wax,
"A Rationale for NASIC at M.I.T." by Richard Marcus, "The Development
of NASIC at M.I.T." by Alan Benenfeld, "The Integration of NASIC into
the M.I.T. Libraries" by Mary Pensyl, and the "Future Plans of NASIC"
by David Wax. The program concluded with demonstrations by the Information
Specialists of the NASIC facilities in the M.I.T. Libraries.

13. On 11 February 1974, the Lincoln Laboratory Library Director, Mr. Lloyd
Rathbun, was sent several hundred copies of each of the NASIC brochures
and price list for distribution to Lincoln personnel. An announcement
about NASIC appeared in the 15 February issue of the Lincoln Laboratory
Library Scanner. A Lincoln Laboratory bibliographer, Ms. Sara McNeil,
was named as a contact person for Lincoln staff. Lincoln has since
issued a purchase order through their Library for NASIC services.

14. The letter by Miss Nicholson was revised and it and the General Brochure
were sent on 15 February 1974 to all 767 staff at the Charles
Stark Draper Laboratory, Inc.

15. The M.I.T. Industrial Liaison Office in mid-February contacted by telephone
about 20 firms in the Greater Boston area about the availability of NASIC
services. This was a follow-up to a previous letter by ILO announcing
that services to the industrial community would be forthcoming.

16. On 20 February 1974, David Wax and Alan Benenfeld spoke before the
Special Libraries Association, Boston Chapter, Science-Technology
Committee. The development of NASIC and its implications for special
libraries were discussed. About 80 librarians from the Greater Boston
area attended. Several inquiries from attendees at that meeting have
since been received about providing services to particular libraries.
We are aware of at least one announcement, in the Massachusetts College
of Pharmacy Library's bulletin, of NASIC services.

17. On 5 March 1974, a demonstration and discussion about NASIC was held
at M.I.T. with 8 librarians from the University of Massachusetts,
Boston. Information about NASIC services has since appeared in their
library bulletin.

18. On 10 March 1974, a revised letter by Miss Nicholson was sent along
with copies of brochures and price list to Miss Helen Brown, Director
of the Wellesley College Library, for distribution to Wellesley faculty
and staff. M.I.T. and Wellesley have a cross-registration program
and other reciprocal agreements.

19. On 14 March 1974, Mary Pensyl gave a presentation on NASIC services in
the M.I.T. Libraries at a joint meeting of the Harvard and M.I.T. Library
Staff Associations. About 60 persons attended.

20. On 1 April 1974, Mary Pensyl spoke on NASIC as an M.I.T. Libraries
service to 30 persons attending a meeting of the M.I.T. Women's Forum.

The service site in the Science Library was strikingly repainted by
the M.I.T. Libraries. Consideration is now being given toward a similar
effort at some of the other service sites to enhance visibility and command
attention. Plans for effective displays of brochures at each site
are underway as are plans for poster displays.

Word-of-mouth advertising by satisfied customers about good, efficient,
and effective NASIC service is a most desirable goal. To this end, the
Information Specialist training program and the library staff orientation
meetings have been particularly sensitive to furthering the initial enthusiasm,
esprit de corps and cooperation among all participants, backed up
by the development of coordinated procedures for effective service.

Our experiences to date indicate that all publicity mechanisms will
yield some response but that word-of-mouth advertising is gaining in
predominance. As Phase 2 gets underway, consideration is being given to new
avenues of publicizing services, in particular, personal contact with potential
prime user groups. We also plan in Phase 2 to survey both users and
non-users of NASIC services. An initial evaluation of the marketing effort
and some plans for further effort are given later in Task 12, Monitoring
and Analysis of Service Operations.

Testing of Service Procedures (Task 9)

During the two week period preceding the opening of services, dry
runs on parts of the operation were held. These were concentrated entirely
on interviewing and performing searches with trial users. Testing
of all other procedures is being done under real conditions of service as
part of the continuing monitoring and analysis of the operation (see Task 12).



Initiate Service (Task 11)

This task represents a major milestone in the history of NASIC.
On Thursday, 15 November 1973, NASIC at M.I.T. officially began services
by taking appointments for service beginning Monday, 19 November. Interestingly
enough, inquiries were received at the Coordinator's Office
on 14 November from faculty and staff who had already begun to receive
the announcements sent to them. Service operations were launched just
four months after the start of M.I.T.'s subcontract. Task 12 below
summarizes and analyzes the activities following the start of service
operation.

Monitoring and Analysis of Service Operations (Task 12)

The discussion of each of the preceding tasks has been mainly a 
descriptive report of the design, development, and implementation activities 
associated with NASIC services. In this task we look at the service operations
themselves and compare design with practice. Feedback from monitoring
and analysis of operations is vital not only to making changes to the
operation or to the policies governing the operation, but also to understanding
better the functions being performed and the needs of the user
community. Because we have been on-the-air for only a limited time, feedback
has been useful so far mainly to improve understanding and to modify
general plans; few changes of consequence have been made as yet to either 
policy or to operation. The following analysis draws upon the statistical 
data appearing in Tables 3 through 9, upon the descriptive reports above, 
and upon user receptivity reports. 

It may be helpful to begin with a statistical characterization of use. 
All statistics refer to the period 15 November 1973, the date services began, 
through 28 February 1974.

There were 57 users; of these, 29 used one of the three NASIC data
bases and 28 used the MEDLINE data base. All searches were online retrospective
searches. There were no current awareness searches, either online
or offline, and no offline retrospective searches. All searches but one
were run on an appointment basis with the user present. The one exception
was a second search for a user run in a delegated mode using the INFORM
data base. Table 3 shows the breakdown of users by data base and by the
library in which service was received. Table 4 shows the breakdown of
users by organization and status for each data base. For M.I.T. campus
users, Table 5 shows for each data base the distribution of data base
users by department or laboratory. Table 6 shows the publicity mechanisms
to which M.I.T.-affiliated users responded. Table 7 shows the methods of
payment selected by all searchers distributed according to their status.
The data in Tables 6 and 7 do not distinguish between NASIC data bases and
MEDLINE because the data base was not expected to influence the distribution.
Table 3

NASIC AT MIT
SUMMARY DATA
15 NOVEMBER 1973 TO 28 FEBRUARY 1974

NUMBER OF SEARCHES: 29 NASIC, 28 MEDLINE

(ALL ARE ON-LINE RETROSPECTIVE SEARCHES)

SEARCH LOCATION AND DATA BASE:

BARKER 
DEWEY 

HUMANITIES 
ROTCH 
SCIENCE 
OTHER LIBRARY 
OFFICE/LAB 

TOTAL 



CHEM 



ERIC 
1 
2 
1 



INFORM 



NASIC 
TOTAL 

10 

6 
1 

6 
6 



MEDLINE



28 



14 



10 



29 



28 
Table 4

NASIC AT MIT
SUMMARY DATA
15 NOVEMBER 1973 TO 28 FEBRUARY 1974

USER AFFILIATIONS:

MIT/CAMPUS TOTAL
FACULTY
GRADUATE STUDENT
UNDERGRADUATE
OTHER STAFF



CHEM 
12 
5 
2 



ERIC 



INFORM 



NASIC 
TOTAL 

24 

5 

9 

10 



MEDLINE 
25 

6 
11 

4 

4 



MIT/LINCOLN 

DRAPER 

WELLESLEY 

OTHER UNIVERSITIES 
(TOTAL) 

FACULTY 

GRADUATE STUDENT 
UNDERGRADUATE
OTHER STAFF 
GOVT. AGENCIES 
INDUSTRIAL/COMMERCIAL 
OTHER AFFILIATIONS 
Table 5

NASIC AT MIT
SUMMARY DATA
15 NOVEMBER 1973 TO 28 FEBRUARY 1974

DEPARTMENT AFFILIATIONS OF
MIT/CAMPUS USERS:

DEPARTMENT
BIOLOGY
CHEMICAL ENG.
CHEMISTRY
ELECTRICAL ENG.
MATHEMATICS
MECHANICAL ENG.
METALLURGY
NUCLEAR ENG.
NUTRITION
OCEAN ENG.
SLOAN SCHOOL
URBAN STUDIES
HEALTH SCIENCES PROG.
LIBRARIES
MAGNET LAB.
OASIS
PLANNING OFF.
SEA GRANT

CHEM
1 
1 
2 



ERIC 



INFORM 



1 

4 



NASIC 
TOTAL 

1 

1

2 

2 

2 
1 
1 

1 
5 

2 

1

1 

3 



MEDLINE
1 

1 
1 
1 

5 
1 
Table 6

NASIC AT MIT
SUMMARY DATA
15 NOVEMBER 1973 TO 28 FEBRUARY 1974

PUBLICITY RESPONSE OF MIT-AFFILIATED USERS: (Combined NASIC and MEDLINE)

                                            CAMPUS                LINCOLN  DRAPER  TOTAL
                               FACULTY  GRAD.  UNDERGRAD.  OTHER      LAB     LAB    MIT

MAILINGS (LETTERS AND/OR
  BROCHURES)                         3      4           2      3        -       -     12
THE TECH ARTICLES                    -      2           1      1        -       -      4
TECH TALK ARTICLES                   -      2           1      -        -       -      3
COLLEAGUE                            1      6           2      1        -       -     10
DEMONSTRATION                        -      4           -      1        -       -      5
LIBRARY STAFF REFERRAL               2      2           -      -        -       -      4
DISPLAY/OBSERVER IN LIBRARY          1      5           1      2        -       -      9
*POSTER/DISPLAY OUTSIDE
  LIBRARY                            -      -           -      -        -       -      0
REPEAT USER                          5      -           -      4        -       -      7
OTHER SOURCES                        -      -           -      1        -       -      1

*Not implemented in the reporting period.
Table 7

NASIC AT MIT
SUMMARY DATA
15 NOVEMBER 1973 TO 28 FEBRUARY 1974

METHOD OF PAYMENT FOR SERVICES: (Combined NASIC and MEDLINE)

UNDER- LINCOLN DRAPER

FACULTY GRAD. GRAD. OTHER LAB LAB ACAD. COMM.

MIT REQUISITION 11 15 1 12 - - 1

PERSONAL CHECK - 4 3 2 - - 6

CASH - 1

PERSONAL BILL THRU MIT

PURCHASE ORDER - - - - - - - 1
Table 8

NASIC AT MIT
SUMMARY DATA
15 NOVEMBER 1973 TO 28 FEBRUARY 1974

CHARACTERISTICS OF ON-LINE SEARCHES BY APPOINTMENT:

                                                                       NASIC
                                             CHEM     ERIC   INFORM  OVERALL  MEDLINE

SEARCHES BY APPOINTMENT:                       14        5        9       28       28

AVERAGE ADVANCE TIME IN ARRANGING
AN APPOINTMENT: (business days)               6.4      3.2      5.8      5.7      5.2

AVERAGE CONNECT TIME: (minutes)                42       36       29       37       47

AVERAGE APPOINTMENT LENGTH: (minutes)          77       62       76       70       71

AVERAGE RATIO CONNECT TIME TO
APPOINTMENT TIME:                             .55      .58      .38      .53      .66

APPOINTMENTS WITH A MACHINE PROBLEM:            3        0        0

AVERAGE TOTAL PROBLEM TIME: (minutes)          14        -        -       14       13

OFF-LINE PRINT REQUESTS:                        7        4        5       16       20

AVERAGE OUTPUT: (pages)                        44       44       28       39       50
                (citations)                   211      110       37      131      394

AVERAGE COMPUTER PLUS ADMINISTRATIVE
CHARGE: (before any allowances)            $39.52   $26.95   $32.22   $34.90   $13.86

AVERAGE SPECIALIST CHARGE:
(before Introductory Credit)               $10.25    $8.22    $8.58    $9.36    $9.32

AVERAGE PRINTOUT CHARGE:                    $6.12    $5.33    $7.18    $6.21    $4.53

AVERAGE USER COST:
(assuming offline printouts requested
and no credits given)                      $55.89   $40.50   $47.98   $50.47   $27.71
It is helpful to characterize further the search sessions within
a search mode. To date, we have had essentially only one mode, an on-line
search by appointment. Table 8 characterizes quantitatively the features
of the 28 NASIC data base searches and the 28 MEDLINE searches that were
held by appointment and with the user present. The charges vary widely
among the data bases. The amount of offline printout also appears to be
influenced by the size and the number of years covered by the data base.
On the other hand, other parameters, which we shall examine, show little
difference in usage among the data bases. Some of the fluctuations we do
see may be attributed at this early stage to the one or two users at the
extreme for each data base who have influenced the average values reported
here. For example, the 394 citations printed offline per average session
for MEDLINE is heavily weighted by two users out of 28 who received 1431
and 1338 citations respectively; excluding those users, the average MEDLINE
printout contains 282 citations. This is still a large number but the
medical data base is quite extensive and the figure is comparable to the
more than 200 average citations printed for users of the similarly extensive
chemistry data base. With a larger number of users, extreme cases
will have less influence on the average values; or, indeed, we may discover
that these "extremes" are not really abnormal.

The appointment itself runs some 70 minutes, of which 37 minutes, a
little more than half the time, is spent in an actual on-line connection
to the retrieval system, in our case either the SDC ORBIT or the NLM or
SUNY MEDLINE Systems. About 10% of the searches have encountered more
than minor machine connection problems; occasional disconnect or line noise
problems are more frequent but they are considered minor if recovery is
almost instantaneous and if the search is unimpeded. If machine problems
impede the search in any way, the user is given credit against his total
charge for the amount of time involved.
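
As a rough check on the figures just quoted, the ratio of connect time to appointment length can be recomputed from the averages in Table 8. The sketch below is illustrative only: the function and variable names are hypothetical, and the minute values are the rounded averages as we read them from Table 8.

```python
# Rounded averages from Table 8: (connect minutes, appointment minutes).
# The dictionary layout and all names here are illustrative assumptions.
TABLE_8_MINUTES = {
    "CHEM": (42, 77),
    "ERIC": (36, 62),
    "INFORM": (29, 76),
    "NASIC OVERALL": (37, 70),
    "MEDLINE": (47, 71),
}


def connect_ratio(connect_minutes, appointment_minutes):
    """Fraction of an appointment spent connected to the retrieval system."""
    if appointment_minutes <= 0:
        raise ValueError("appointment length must be positive")
    return connect_minutes / appointment_minutes


if __name__ == "__main__":
    for data_base, (connect, appointment) in TABLE_8_MINUTES.items():
        ratio = connect_ratio(connect, appointment)
        print(f"{data_base:14s} {ratio:.2f}")
```

Because the inputs are already rounded to whole minutes, small differences from the tabulated ratios would not be surprising.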

Only 50% of NASIC users and 60% of MEDLINE users request offline printouts.
The request is issued online during the search session but the printing
is done offline on a high speed printer and sent by airmail. For users
who have a lengthy list of references, it is often cheaper to obtain offline
printouts than to print online at 30 characters per second.
However, users do obtain enough particulars online so that their work
can proceed until the printouts are received a few days later (on rare
occasions, the next day). The average offline print request runs some 40 to
50 pages. The number of citations varies with the comprehensiveness of
the data base. The reason for at least some users not issuing an offline
print request is that no relevant material has turned up in the search
process. These negative searches are not at all uncommon or unexpected
among research-oriented users who are often simply seeking reassurance
that no one else has done what they propose. Although we have not specifically
analyzed for the number, the percentage of negative searches is
probably at least 20 or 25 per cent and may run as high as 40 or 50 per
cent.

The average total charge to a user is $50.47 for searching a NASIC
data base and $27.71 for searching MEDLINE. This is the charge assuming
offline printouts are requested and without making any credit allowances.
In this period, M.I.T. users receive an introductory credit for the Specialist's
time and so the real cost to an M.I.T. user has been about $9
less in each case. The average cost of the printout component is $6.21
for NASIC data bases and $4.53 for MEDLINE. The MEDLINE printout is a
per-page charge. The NASIC printout cost has been confounded by the change
from a per page to a per citation charge as discussed under Task 6D,
Service Charges. Since the per citation rate was in effect only for
February, we expect the average NASIC printout charge to rise. The major
component of the total cost to a user is the charge covering computer
connection, computer searching, and M.I.T. administrative costs (cf. Task
6D, Service Charges). For NASIC the average charge for this component is
$34.90 and for MEDLINE it is $13.86. The MEDLINE cost is considerably
lower because the National Library of Medicine subsidizes entirely the computer
search cost and because the connection cost is lower than the connection
costs to SDC.
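
The way the three charge components add up to the average user cost can be sketched as follows. This is a minimal illustration, not the pricing algorithm itself: the field names are hypothetical, the dollar amounts are the averages given in Table 8, and treating the introductory credit as exactly equal to the Specialist charge is our simplifying assumption.

```python
from decimal import Decimal

# Average per-search charge components from Table 8, in dollars.
# Field names are illustrative assumptions, not terms from the work order.
AVERAGE_CHARGES = {
    "NASIC": {
        "computer_and_admin": "34.90",
        "specialist": "9.36",
        "printout": "6.21",
    },
    "MEDLINE": {
        "computer_and_admin": "13.86",
        "specialist": "9.32",
        "printout": "4.53",
    },
}


def average_user_cost(components, introductory_credit=False):
    """Sum the charge components for one search.

    With introductory_credit=True the Specialist charge is waived, which
    approximates the credit given to M.I.T. users (an assumption).
    """
    total = sum(Decimal(value) for value in components.values())
    if introductory_credit:
        total -= Decimal(components["specialist"])
    return total


if __name__ == "__main__":
    for system, components in AVERAGE_CHARGES.items():
        full = average_user_cost(components)
        credited = average_user_cost(components, introductory_credit=True)
        print(f"{system}: ${full} before credit, ${credited} after credit")
```

Decimal arithmetic is used here only so that the cents add exactly; any fixed-point representation would serve equally well.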

What effect does price have on data base usage? This is perhaps the
most important question we can address. The availability of the subsidized
MEDLINE system to the user community through the same channels as the NASIC
data bases gives us some additional handles with which to begin answering
the question. At this point in time, there is no definitive answer because
price is one of many confounding factors that may motivate a person to use
computer-based reference services such as NASIC and MEDLINE. For example,
price, marketing, need, prior familiarity with data base, availability of
funds, complexity of the search, urgency of results, convenience, and
influential or peer users, are all candidates to influence at least some
users. Price is the easiest to measure and study but we do not yet know
whether it is the most important of all factors or only one of several
important factors. There may be threshold effects associated with price
and other factors which influence usage. What can we tell at this time
from our initial data?

There is no initial onslaught of users for any data base. The number
of MEDLINE users runs neck and neck with the total number of NASIC data
base users. However, MEDLINE is used twice as heavily as the most frequently
used NASIC data base, in chemistry. Price could explain the difference
between CHEMCON and MEDLINE usage but other factors may be at least as,
or even more, influential. One such factor is the proportion of the total
community interested in a data base. For example, the chemistry data base,
CHEMCON, is used three times as often as ERIC, and the INFORM business
base is used twice as often as ERIC. Both CHEMCON and INFORM are more
expensive than ERIC, but more people at M.I.T. are active in chemistry
research than in management research, and more in the latter than in educational
research. Thus for these three areas, usage ranks in order of the
size of the interested population, and not inversely with cost. What
about interest in the medical research area? At M.I.T., interest in medical
and health care research is known to be highly diffused throughout almost
all departments. This is confirmed by the distribution of users shown in
Table 5. Interest in medical science and technology and related health
care systems is at an all time high nationally and at M.I.T. These are
areas where research funds are more fluid but there is also a genuine trend
at M.I.T., as elsewhere, to turn the attention of more of its research resources
to social and environmental problems. Thus the greater usage of
MEDLINE may also result from and be proportional to the number of people
having interest in it rather than being an effect of price, even when
the price is heavily subsidized. The explanation based on interest is consistent
not only between NASIC and MEDLINE but also between the respective
NASIC data bases. One way to test this hypothesis is to compare the ratios
of data base use to the population ratios of fields of interest of researchers.
We hope to tackle this in Phase 2 of our work but the test is complicated
by the fact that today's highly interdisciplinary research shows
little respect for departments, laboratories, and even data bases initially
organized and named along lines of more traditional disciplines.

If interest can explain data base use at least as well as price, are
there results for which price is more clearly a major agent of influence?
Analysis of Tables 4 and 7 sheds considerable light on this area. The
status of our users appears to be influenced by the price. Of 49 M.I.T.
campus users (24 for NASIC data bases, and 25 for MEDLINE) only 4 have
been undergraduates and all 4 used MEDLINE. Undergraduates and graduate
students each represent about a third of the M.I.T. professional body,
the faculty about ten percent, and the research and administrative staffs
the remainder. Thus undergraduate use of MEDLINE is still less than
proportional to their population. On the other hand, graduate students have
used all data bases, and except for chemistry, in greater proportion than
their population. More graduate students are involved in research than
undergraduates. But even more significantly, graduate students have greater
access to contract and grant monies than do undergraduates. As Table 7
shows, 3 of 4 undergraduates paid out-of-pocket (by check) for their searches,
but only 25 percent (5 of 20) of the graduates paid out-of-pocket. Table 4
shows faculty and other professional staff (research and administrative)
use. Other staff use has been much higher than expected. From Table 7,
all faculty users charged services against contract funds, as did 6 out
of 7 other staff people. Thus, price appears to influence the class of
user.
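
The proportionality argument above can be made concrete by dividing each class's share of users by its share of the population. The sketch below is only illustrative. The user counts are the combined NASIC and MEDLINE campus figures as we read them from Table 4, the population shares are the approximate values quoted in the text (with "other staff" taken as the remainder), and the function name is hypothetical.

```python
# Campus user counts by status, NASIC and MEDLINE combined (Table 4).
CAMPUS_USERS = {
    "faculty": 11,
    "graduate": 20,
    "undergraduate": 4,
    "other staff": 14,
}

# Approximate population shares quoted in the text; these are rough figures.
POPULATION_SHARE = {
    "faculty": 0.10,
    "graduate": 1 / 3,
    "undergraduate": 1 / 3,
}
# Research and administrative staff are described only as "the remainder".
POPULATION_SHARE["other staff"] = 1.0 - sum(POPULATION_SHARE.values())


def representation_index(users, population_share):
    """Share of users in each class divided by that class's population share.

    A value above 1 means the class uses the service more heavily than its
    size alone would predict; a value below 1 means less heavily.
    """
    total_users = sum(users.values())
    return {
        status: (count / total_users) / population_share[status]
        for status, count in users.items()
    }


if __name__ == "__main__":
    for status, index in representation_index(CAMPUS_USERS, POPULATION_SHARE).items():
        print(f"{status:14s} {index:.2f}")
```

An index of this kind says nothing about cause; it only shows which classes are over- or under-represented relative to their numbers.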

If this is true, then it becomes necessary to find funding mechanisms
to pay, partially or completely, for searches by an undergraduate or any other
potential user who lacks personal funds and has no recourse to grant monies.
This area is currently being addressed at M.I.T. The problem is similar
to the one of providing funding for undergraduate use of computational
facilities.

Is there any other discernible major influence of price? Price appears
to influence the type of service users seek. To date we have had no use
of offline services. A review of Table 2 readily shows that the offline
services currently offered are more expensive to use than online services.
The offline prices are for search periods of one year of a data base. In
the online mode, the cost is per connect-hour and it is possible to search
several years of data base coverage at one time. The connect times in
Table 8 show that the average online search is accomplished in well under
one hour of terminal connect time. Terminal connect time would have to
increase by an extraordinary percentage to exceed the cost of a comparable
offline search. While we have not yet performed online current awareness
searches, there is no question that the above argument on relative
cost still remains valid. Our users have recognized these differences
in relative cost and some of them have made explicit comments to that
effect. But the issue is also confounded by questions of convenience.
There is no question that online searches give the user faster results.
The searching process is highly interactive. There is immediate feedback.
The user obtains results as his search proceeds. There is no
delay except for additional extensive printouts (the delay in mailing is
only one-way because the search itself has already been performed).
Thus, online services are not only lower in cost, they are also more
convenient. It is worthwhile re-emphasizing at this point that NASIC
represents services available on systems maintained by others whose costs
are shared widely. The commonly held view in many quarters that online
systems are more expensive than offline systems just does not carry over
to the kinds of computer based reference services that NASIC activities
represent.

Does price affect the way in which a data base is used? It appears
from the data in Table 8 and the above discussion of that table, that
the characteristics of searching a data base, at least in the appointment
mode, are not influenced by price. Users do not search a data base for
a longer time simply because it costs less money. If they did, the
average connect time for each data base should show a progressive increase
with decreasing cost but it does not (cf. Table 2). It is much
more likely that the size and comprehensiveness of the data base being
searched influence the connection time. Indeed, the bigger the data
base, the greater the connection time. INFORM is as expensive as
CHEMCON, but it is the smallest and least comprehensive data base; it
has the lowest connect time. The other side of the coin, namely the
time the user spends with the Specialist in his appointment but not
in an online connection, is not as amenable to interpretation. We
suspect that this time, which is largely spent reviewing the problem and
setting up an initial strategy, may be a function of the kind of indexing,
the vocabulary controls, and the retrieval aids that are associated with
each data base. Much more study is required in this area, particularly
because it may turn out that other factors limit the total time available
for both pre-search strategy plus on-line search, and that these
limits, in turn, influence the proportional breakdown. Some factors
that might influence the total time limit on pre-search plus search are
psychological conditioning of Specialist and user to expect about one
hour, anxieties about exceeding an appointment block, and external boundary
conditions such as having another commitment scheduled, perhaps because
of prior expectations of appointment length.

Why hasn't there been greater use of computer-based services during
this ^ month period? One might suspect that one reason was timing.
Thanksgiving, fall semester finals, Christmas recess, the three-week
Independent Activities Period in January, and a recess between IAP and
the spring semester all fell within this period. However, business did
not suddenly boom once the second semester began in early February.
Three factors much more important than timing may provide the answer:
(1) paying for services, (2) data bases of interest, and (3) marketing.
Earlier, we said that price alone cannot explain which data bases
are more frequently used. However, the fact that users are asked to
pay for the services they receive, services that traditionally have been
"free," may be a psychological, as well as an economic, barrier. Potential
users who come to NASIC have already overcome the barrier. Additional
marketing will help others over the barrier. Once that barrier is overcome,
the particular data bases of interest to a user will be searched,
at least as long as their absolute cost is not unreasonable in terms of
value received. The people who come to us want service. The relative
cost of different bases may be less important to potential users than
the acceptance of having to pay for service in the first place. At
present, direct payment for information (copying services excepted)
is new to most people. But administrators of information sources know
all too well that their cost is a cost actually borne by users indirectly
through institutional overhead or by taxes. We will not get into the question
of whether the cost of all information services should be borne directly
or whether some or most such services should continue to be paid for
indirectly. But shortly we will address that question with respect to
computer-based literature searches. First, let's consider whether there
is a value to having these services. If there is none, cost would be moot.

Almost all of these computer-based services have a printed counterpart
such as Chemical Abstracts for CHEMCON or CA-Condensates, Index
Medicus for MEDLINE, Research in Education for ERIC. In fact, the machine
readable data base is really a by-product of producing the printed form.
What then are the values gained by working with the machine version that
are present to a lesser degree or absent when working with the printed
publication? The major values of a machine-based search are:

1. Less time required by a user to do a search, especially if the search
is by subject.

2. Only one search need be performed for the cumulated years of coverage
of the data base.

3. Physical manipulation of multiple volumes is avoided.

4. Complex logical combinations can be used that are difficult or impossible
to carry out in manual mode.

5. Typically, the indexing and retrieval mechanisms go beyond those present
in printed form so that there is greater accessibility to each item
(the number of access points is increased).

6. Computer printout generally eliminates the need to take notes or
copy citations.

Time saved and accessibility are the two most important. Unfortunately,
in academic environments today, people's time is less highly valued than,
say, in industry, but this is changing. There is greater emphasis on
productivity in the office, laboratory and classroom and this will,
in turn, influence research in the library. Accessibility to records is
improved by deeper indexing and by machine techniques but it requires
particular knowledge, understanding, and training to be used properly.
Users of machine services, by so doing, can improve their skills in
conducting searches in any medium and thus their use of the library in
general can become more effective. This is very important because machine
services are in no way a replacement of traditional services; each
type of service complements the other.

We have no doubts about the value of machine-based services to users.
Do users have doubts? Because of the small number of users so far, it might
seem that potential users do. However, our initial actual users have been
highly enthusiastic about the services they received; more on this later.
What then is the problem? For one thing, it takes time for a new service
to make itself known and to build up a consumer group. We began our
operations with only four data bases. We need to make available a larger number
of data bases of interest to more segments of the community. We are
proceeding early in Phase 2 to train Specialists in working with additional
data bases. In two years we expect to have about twelve to fifteen such data
bases available.

But just making more data bases available will not in itself solve the
problem. An improved marketing effort is essential in order to develop an
initial thrust of interest in each new data base. Over the somewhat longer
range, word-of-mouth should become more and more effective as a source of
new users. An initial core of satisfied users needs to be obtained (while
taking pains to see that initial service is of high quality and effective
because it is much more difficult to overcome negative reactions). The
marketing effort to date has relied on a number of devices, mainly
written materials (letters and brochures) sent to potential users,
demonstrations, and news stories (refer to the lists under Task 7 on
Marketing and Publicity). Table 6 shows that all publicity mechanisms yield
some response in the M.I.T. community; in addition, a monthly breakdown of
this data would show that word-of-mouth by satisfied users is gaining in
predominance with time. Within the library, referral by library staff,
displays, and demonstrations have been important avenues. Of 49 M.I.T.
users, seven have been repeat customers (a form of self-referral and an
ultimate test). But the publicity for reaching users who haven't come
into the library generally has been passive. We personally know of several
faculty and other staff members who received brochures by mail but who,
when asked informally about the mailing, have little or no recollection
of it. The mailings have gone unnoticed or unread by many recipients.
This form of advertising, while having a definite role, is not sufficiently
differentiable from all the other mail received by our potential users.
A boost is needed to create awareness of NASIC among prime user groups.
Personal contact and follow-up is a necessity. However, this does not
mean a hard sell, which probably would not go over well in academia. It
does mean a personal touch which can tune into specific needs of the
potential user. New publicity mechanisms must relate more heavily to specific
and current needs of potential users, such as in thesis work, contract
preparation, or course development, to motivate them to become real users.
In addition, it is important to emphasize more the values of a machine-based
search since these values are not necessarily obvious to users.
In Phase 2 we plan to undertake a more active publicity campaign. This
will include, among other techniques, meeting with faculty individually or
in groups, phone-call follow-ups to mailings, and enlisting the aid of
satisfied users. Libraries usually have not actively promoted their services.
But for a new product such as computer-based searching to get off the ground
and find acceptance within the community, active promotion is a necessity.
It may even attract new users to other library services.

We know from the short time that we have been on the air that there
is a demand for the kind of services being offered. That there is some
strength to this demand can be adduced from the fact that many users are
paying out-of-pocket for such services, although the majority of users to
date have access to contract or grant monies. We also know that a greater
number of data bases and a more personal marketing effort can be expected
to improve the growth of service use. Given the value and the use, who
should pay for services such as these? This is a question that each
institution must, of course, answer for itself in terms of its own size, budget,
and funding mechanisms. To take an extremely simple and conservative
calculation, suppose the expense associated with a typical search, including
all direct and indirect costs for computer time, communications, direct
labor, administration, terminals, advertising, and materials, is $75.
If a modest 500 searches can be expected in the course of a year, then
$37,500 is needed to cover expenses; for 1000 searches, $75,000 is required.
The reader can make multiplications for other demand rates but it should
be obvious that complete subsidization of even moderate use requires a
dollar level that could be difficult for an organization to raise in
today's economy on top of already seriously strained budgets, especially
where the cost-effective benefits may not be apparent for this new kind
of service. True cost-effectiveness calculations will require looking at
the user, and the time values of the user, as part of the service system.
If a library cannot completely subsidize these services, then initially
at least, it may be better to offer services at fees that recover most
cost and, by so doing, demonstrate to higher administrative levels that,
even with a charge, there is an actual demand for these services.
Simultaneously, other mechanisms can be sought to support all additional costs
through rearrangement of budget priorities, through cost-sharing with
departments, or through entirely new avenues. For example, M.I.T. is, at
least currently as an introductory offer, subsidizing the time of the
Information Specialist with all other costs charged to users. Many other
ways to share the costs are possible, and there also are many ways to set
up a pricing structure but it is not our purpose in this report to review
them. It is our purpose to point out that a new service of value to users
and in demand by them, is available and that it can be offered by a library,
even if the user must be charged some, if not all, of the cost.
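
The multiplication in the paragraph above can be sketched in a few lines
of Python. This is only an illustration: the $75 per-search expense and
the 500 and 1000 search volumes come from the text, while the function
name and the recovered_fraction parameter (the share of cost recovered
through user fees) are hypothetical additions for exploring partial
subsidy.

```python
def annual_subsidy(searches_per_year: int,
                   cost_per_search: float = 75.0,
                   recovered_fraction: float = 0.0) -> float:
    """Dollars the institution must supply in a year.

    cost_per_search is the report's illustrative $75 all-inclusive figure.
    recovered_fraction is a hypothetical knob: the share of each search's
    cost that user fees recover (0.0 means complete subsidization).
    """
    if not 0.0 <= recovered_fraction <= 1.0:
        raise ValueError("recovered_fraction must lie between 0 and 1")
    return searches_per_year * cost_per_search * (1.0 - recovered_fraction)


if __name__ == "__main__":
    # The two demand rates quoted in the text, fully subsidized.
    for volume in (500, 1000):
        print(volume, "searches:", annual_subsidy(volume))
```

Setting recovered_fraction above zero shows how quickly the institutional
burden falls once fees recover most of the cost, which is the course the
paragraph recommends as a starting point.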

If a library chooses not to offer such services, other campus organizations
such as individual departments or a computation facility probably will,
and for a fee too. If a library relinquishes this kind of service, it would
be, in our view, unfortunate and indeed a disservice to the campus and to the
profession for the following reasons:

1. librarians represent the bibliographic expertise available to the
community

2. these services are, in effect, a powerful catalog and index to materials
available through the library

3. information needs of users require availability and integration of diverse
resources, one complementing another

4. separation or isolation of these resources undermines the ability
of any one resource to be used effectively

5. the end user suffers because different organizations can each satisfy
only a part of his information needs.

Let's turn to other analyses that can be made at this time of the M.I.T.
operation. Table 8 tells us that the average number of business days
between the time a user arranges for and holds an appointment is 5.7. The
median wait is only 3 business days; that is, half of the appointments are
held within 3 days of the user's call. The average wait has been affected
by several users who found it necessary to reschedule their initial
appointments or who otherwise had full calendars and found it difficult to initially
arrange an appointment for a time slot when the data base of interest was
also up. The waiting time may seem excessive but no user has requested
immediate service. It is possible, however, that we may not be hearing from
users who want instant or demand service such as they get themselves by
searching on their own manually in the library. Nevertheless, we suspect
that there is a greater tolerance on the part of most users for waiting for
service than is commonly thought. On the other hand, if the median wait
approaches a week or more, this is likely to be excessive and beyond
tolerance. Users may be tolerant because they know they will obtain
system feedback at the time of their appointment. In our initial operations,
a user could receive service for any data base, except MEDLINE, in any
divisional library. However, the time of the appointment rather than its
location may be more important to a majority of users, although several
users have indeed requested service in a particular library. The issue
of service facilities geographically convenient to the user is clouded
by the SDC three-hour time windows during which only certain data bases
are accessible. We hope to clarify the picture of user convenience during
Phase 2. We also wish to experiment with providing service directly in
the office or laboratory of a user. In addition to time and location
of an appointment, some repeat users have specifically requested service
with the same Information Specialist who served them earlier; this makes
good sense, particularly if the same user problem is to be searched against
another data base. It also represents the beginning of a professional
relationship between Specialist and user.
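
The gap between the 5.7-day average and the 3-day median is what one
would expect when a few rescheduled appointments sit far out in the tail.
The short Python sketch below illustrates the effect. The ten waiting
times are invented for illustration only (they are not the Table 8 data);
they were chosen so that the mean and median happen to match the reported
figures.

```python
from statistics import mean, median

# Hypothetical waits in business days, NOT the actual Table 8 records.
# Eight ordinary appointments plus two that were rescheduled far out.
illustrative_waits = [1, 1, 2, 2, 3, 3, 4, 5, 16, 20]

average_wait = mean(illustrative_waits)    # pulled upward by 16 and 20
median_wait = median(illustrative_waits)   # unaffected by the two outliers

# Dropping the two long waits shows how much they contribute to the mean.
typical_only = [d for d in illustrative_waits if d <= 5]
average_without_outliers = mean(typical_only)

if __name__ == "__main__":
    print("mean:", average_wait)
    print("median:", median_wait)
    print("mean without the two long waits:", average_without_outliers)
```

Because the median ignores the size of the largest values, it is the
better single indicator of what a typical user experiences, which is why
the text states its tolerance threshold in terms of the median wait.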

There is additional data to draw upon which will help to complete
the picture of NASIC operations at M.I.T., particularly in terms of background
functions and duties. For example, a user who either phones in to, or stops by,
the Coordinator's Office spends from 5 to 20 minutes conversing with the
Coordinator's Assistant about the services available, their cost, which
one(s) are applicable to his problem, arranging an appointment, and the
nature of the user problem statement. Two or three subsequent calls are
often necessary, but these are generally much briefer, and the nature of
these conversations mainly is to confirm an appointment, remind a user of
an appointment, or request return of the problem statement.

The interaction that transpires is an essential, even influential,
element in the marketing process. A potential user already has been
motivated to make an inquiry. Some of these people are prepared to arrange
an appointment, others need further reassurance that NASIC services will
be helpful to them, and others cannot be served at the present time by the
available data bases but may be candidate users of an expanded service.
A file of prospective users and their interests is being kept.

The centralized mode for dispersing information and for appointment
arrangements has been most beneficial in lessening the burden that
otherwise might be carried by the Specialists and other personnel in each
library. The Specialists can concentrate more on the actual service than
on arrangements for service. In those cases when the first point of contact
of a prospective user with NASIC is within the library through a staff
member or by seeing a display, the prospect is referred to the Coordinator's
Office for more detailed information.

The Specialists spend about ten to fifteen minutes before an
appointment reviewing the user problem statement to familiarize themselves with
the problem. This may include consulting printed forms of a data base or
associated thesauri and other retrieval aids. In several cases, the
Specialists have also conducted online pre-appointment searches of about 10
to 15 minutes to test out vocabulary and strategies. These pre-appointment
searches tend to be undertaken when the Specialist is less familiar with
a specific subject area or its treatment in a data base and this may be
considered part of the learning process. We expect that this type of
activity will decrease as the Specialists continue to gain experience
in providing these services. A natural question to ask is what effect
a pre-appointment search has had on the effectiveness of the actual
search during the appointment, but we have not yet looked into this,
either quantitatively or qualitatively.

After an appointment, a Specialist may spend up to an additional half hour
documenting the major events transpiring during the appointment. These "minutes
of the session," so to speak, are an important tool for any later
analysis of the sessions, the problems (technical or logical), and the
reactions. If a user does not wish to keep the print record of his search
(a rare occurrence), then it is made part of the documentation. In
the absence of that record, the Specialist generally notes the strategy
that has been used. These notes can be most helpful later for training new
Specialists, for demonstrations, and for referral from a similar problem
by another user.

The total pre-appointment plus post-appointment activity of a Specialist
may be, on average, about an hour or almost as long as the average
appointment itself. Some of the pre-appointment time is a result of inexperience.
A good portion of the post-appointment time is a direct result of the
experimental mode in which M.I.T. is studying and testing these kinds of
services for NASIC.

None of the costs associated with the Specialist's pre-appointment or
post-appointment activities are charged directly to users. The pricing
structure, which was described in Task 6d, included within the derivation
of the hourly rate for the Information Specialist about 10 minutes for
pre-session plus post-session activity. Even with allowances for
inexperience or for testing activities, the 10 minute estimate is too low.
A better estimate of steady-state, non-testing activities occurring either
before or after an appointment is probably 20 to 30 minutes. Thus, the
hourly rate charged for a Specialist's time should be higher for
operational cost recovery. In addition, if it becomes clear that there is a
normal role for pre-appointment searches, then the cost for this computer
activity will also need to be included within the pricing structure,
either by adding that connect time to the appointment connect time, or
by increasing the administrative surcharge within the computer search
component of the pricing structure. We expect to look into this area
more closely in Phase 2.
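
The effect of the low overhead estimate on the hourly rate can be
sketched as follows. This is not the Task 6d derivation, which is not
reproduced here; it assumes, for illustration only, that unbilled pre- and
post-session minutes are spread over a one-hour billed appointment, and
the $12 labor cost per hour is a made-up figure. Only the 10, 20, and 30
minute overheads come from the text.

```python
def billed_hourly_rate(labor_cost_per_hour: float,
                       overhead_minutes: float,
                       appointment_minutes: float = 60.0) -> float:
    """Rate to charge per appointment hour so that unbilled overhead is recovered.

    Assumed model (not from the report): every billed appointment of
    appointment_minutes also consumes overhead_minutes of Specialist time
    that is not billed separately, so the billed rate is scaled up in
    proportion to total time spent.
    """
    total_minutes = appointment_minutes + overhead_minutes
    return labor_cost_per_hour * total_minutes / appointment_minutes


if __name__ == "__main__":
    HYPOTHETICAL_LABOR_COST = 12.0  # dollars per hour, illustrative only
    for overhead in (10, 20, 30):   # minutes: initial estimate, then revised range
        rate = billed_hourly_rate(HYPOTHETICAL_LABOR_COST, overhead)
        print(f"{overhead} min overhead -> ${rate:.2f} per billed hour")
```

Under this model, moving from a 10 minute to a 20 or 30 minute overhead
raises the recoverable rate by roughly one-seventh to two-sevenths,
whatever the underlying labor cost.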

In addition, there are interactions which transpire between the
Coordinator's Office and the Specialists. Each interaction is generally
brief but the number is, in part, a function of the volume, and their
aggregate represents Specialist time yet to be accounted for in the
pricing structure for true cost recovery.

Another element of interaction is a weekly two-hour meeting of the
Specialists with the Coordinator and with Laboratory staff to review the
operational activities and associated problems, to discuss further
development and testing, and to continue training. In a steady-state operation,
this activity should be less frequent. However, periodic review meetings
between Specialists and Coordinator represent an activity that should also
be accounted for in a revised pricing structure.

The initial pricing structure was also based on operations requiring
a Coordinator estimated at 20 percent full-time and an Assistant to the
Coordinator estimated at 60 percent full-time (cf. Task 6d). Our experience
indicates that the duties performed by the Assistant reflect, in actuality,
a full-time position. The Coordinator position is less clear but
experience indicates that in a steady-state operation (with little or no
development and experimentation) it is at least 70 percent, perhaps higher, for
coordination-related duties. These duties include, but are not limited
to: marketing and publicity; relations with users, Specialists, library
staff and administration, NASIC at NEBHE, and the profession; review and
dissemination of updated information to Specialists. The remainder of
the Coordinator's time can be devoted to performing Information Specialist
functions. Thus, the time assumptions underlying the administrative
surcharge in the pricing structure need to be revised in light of our initial
experience in order to recover costs.

Price stability is a matter of some concern. The advantages to
planning and marketing of a relatively stable price schedule should be
obvious. At M.I.T., the price structure reflects elements of expense.
When the expense rates change, it is in keeping with cost recovery to
eventually pass along these changes in the cost to the user. When to
pass along the cost is another matter. For those expenses under M.I.T.
control, such as administrative expenses or Specialist charges, a periodic
review using historical data for the past period can be implemented. An
annual review, perhaps coincident with the end of the academic year, is
probably sufficient. Other expenses not under M.I.T. control, such as
computer or communication expenses, present difficulties because (1) they
represent the bulk of user charges, (2) they typically are subject to
change on 30 day written notice, and (3) these expenses reflect a direct
outflow of funds from M.I.T. (or any other institution). Because of the
size and nature of this expense, it is one that should be passed on to users
as of the date the new expense rate becomes effective. However, a short
advance notice period from suppliers is insufficient time to prepare, produce,
announce, and disseminate new price lists to users in advance of the change
date. The effects of a short notice on marketing are unwelcome. In an
academic market, a stable period of at least a semester is highly desirable
although a full year would be better still. NASIC/Central at NEBHE, on behalf of
the institutions it represents, needs to negotiate with systems for a period
of advance notice of price changes that would encourage price stability
within academic calendar time frames.

The remarks in the paragraph above on price stability are based upon
experience with a month's notice by SDC of a price change (it was their first
change in over a year) for the connect hour rates of some data bases and
on a change in the basis of printout costs (from a per-page cost to a
per-citation cost). By way of contrast, Georgia issues a price list effective
for a 12 month period and will honor current rates for specific
continuing searches that extend beyond the price-year. On 1 February 1974 we
did change our price rates to reflect the SDC changes. During the few
weeks preceding the change, inquirers were verbally told about the
forthcoming change. They were even told that if they made an appointment
before 1 February, it would cost them less. There was no discernible
effect of the price change on either data base usage or on printout
requests. This is probably because we were still too new for many
potential users to have been fully aware of the service, let alone the
older rates.

The reader may be interested in the revenues that were generated in
the two and a half month period since operations began. Table 9 summarizes
the revenue by major categories for computer and administrative charges,
Information Specialist charges, printout charges, and other charges.
Associated with these categories are allowances against the charges to users for
problems, technical and non-technical, that arose during an appointment.
The largest allowance category is for the time of the Information Specialist
since most users are eligible for the introductory credit. After all allowances,
the total net revenue generated for this initial period is $1095.05 for NASIC
services and $494.08 for MEDLINE services. In Phase 2, we expect to be able
to relate revenue to expenses.
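
The totals quoted above can be checked against the line items of Table 9
with a short Python sketch. The dollar figures are those printed in the
table; blank entries in the table are assumed here to be zero, and the
dictionary keys and function name are labels introduced only for this
illustration.

```python
from decimal import Decimal as D

# Line items as printed in Table 9; blank table entries are assumed to be 0.
TABLE_9 = {
    "NASIC": {
        "computer_admin": D("989.02"),
        "connection_allowance": D("23.76"),
        "specialist": D("267.62"),
        "intro_credit": D("220.55"),
        "other_specialist_allowance": D("12.00"),
        "printout": D("94.72"),
        "other_allowance": D("0.00"),
    },
    "MEDLINE": {
        "computer_admin": D("388.30"),
        "connection_allowance": D("11.70"),
        "specialist": D("261.14"),
        "intro_credit": D("239.56"),
        "other_specialist_allowance": D("0.00"),
        "printout": D("98.70"),
        "other_allowance": D("2.80"),
    },
}


def net_revenue(items: dict) -> Decimal if False else D:
    """Sum the net charge categories after subtracting their allowances."""
    net_computer = items["computer_admin"] - items["connection_allowance"]
    net_specialist = (items["specialist"] - items["intro_credit"]
                      - items["other_specialist_allowance"])
    return net_computer + net_specialist + items["printout"] - items["other_allowance"]


if __name__ == "__main__":
    for service, items in TABLE_9.items():
        print(service, net_revenue(items))
```

Decimal arithmetic is used so that the cents add exactly; the computed
totals agree with the $1095.05 and $494.08 stated in the text, which also
confirms the individual allowance figures.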

Appendix B contains most of the forms that we have been using during
our initial operations. These forms were described under Task 6f. A few
of the forms have undergone modification since they were implemented and
some still need to be revised but none of the changes are major. One
initial form for an inquiry follow-up has since been eliminated because
common techniques such as standard pads or quick notes or phone calls
are more viable. The forms are used at different times by different
people and for different functions. Sometimes the information entered onto
the forms is less than complete but for the most part, everyone recognizes
the importance of the documentation and its role in characterizing,
analyzing, and understanding the service operations and using this information
as a basis for further experimentation, testing, or improvement.

                               Table 9

                             NASIC AT MIT
                         REVENUE SUMMARY DATA
                 15 NOVEMBER 1973 TO 26 FEBRUARY 1974

REVENUE SUMMARY:                                   NASIC       MEDLINE

COMPUTER AND ADMINISTRATIVE CHARGES:             $ 989.02     $ 388.30
  LESS: MACHINE CONNECTION PROBLEM ALLOWANCE:    (  23.76)    (  11.70)
NET COMPUTER AND ADMINISTRATIVE CHARGES:         $ 965.26     $ 376.60

INFORMATION SPECIALIST CHARGES:                  $ 267.62     $ 261.14
  LESS: INTRODUCTORY CREDIT:                     ( 220.55)    ( 239.56)
  LESS: OTHER SPECIALIST TIME ALLOWANCES:        (  12.00)    (    -  )
NET SPECIALIST TIME CHARGES:                     $  35.07     $  21.58

PRINTOUT CHARGES:                                $  94.72     $  98.70

OTHER CHARGES:                                       -            -
  LESS: OTHER ALLOWANCES:                        (    -  )    (   2.80)
NET OTHER CHARGES:                                   -        ($  2.80)

TOTAL NET REVENUE:                               $1095.05     $ 494.08

When the problem statement that users are asked to fill out was drawn
up, we had misgivings about its length. Our fear has proven to be unfounded.
User receptivity to the statement has been gratifying. The statement has
been particularly beneficial to the Specialist in being able to talk with
the user in his own terms. The statement is also beneficial to the user
in getting him to think more about his own problem and its boundary
conditions in advance of the search so that the Specialist and user can optimize
their interaction. In one instance, a potential user was attempting to
provide a narrative of his problem but as a result he realized that he did
not understand his problem. He stated that to go ahead with a search at
that time would be a waste of his own time and money but that he would
do a search after he thought more about what he was after. This
phenomenon is not new to reference librarians but when users pay directly for
services, then knowing what you are doing takes on added importance.
Users are told that the efficiency of their appointment can be increased
and that the cost of services to them can perhaps be lessened if the
statement is filled out prior to their appointment. Users have filled out the
statements in varying degrees of completeness depending upon just how much
information they already have at their command. Only two users have raised
any objection to the statement, and this only because of misunderstanding
its purpose. If a user does not fill out the statement, and he is not
required to, the necessary information is still gathered by the Specialist
during a more extensive reference interview at the time of the appointment.

Interactive retrieval systems are dynamic and change with
modifications over time. There are occasional operational difficulties, both
technical hardware and logical software problems, and there are also data
base content problems. Many problems make themselves known at the
interface between searcher and system. Some difficulties are not so much a
problem as they are a system procedure in need of either improvement or
further exposition. We have been in frequent contact with the SDC Search
Service Staff in order to resolve whatever issues arise with respect to
ORBIT and the available data bases. It is a pleasure to note that the SDC
staff has always been attentive, responsive, and most cordial to our
inquiries. We have had occasion to offer a number of suggestions to SDC on
improvements to their system. While we make no claim to either the
originality or uniqueness of such suggestions, we do note that SDC has acted on
many of them. This is highly indicative of the kind of results that a
central NASIC organization could help bring about even more effectively because it
would represent a large user group. For items in need of clarification,
a central NASIC organization could effectively disseminate responses and
examples or solutions that are more extensive than the information typically
appearing in retrieval system newsletters. We are already doing this
kind of dissemination at M.I.T. from the Laboratory to the Coordinator's
Office to the Specialists. There is a continuing need to generate
materials that supplement the information available directly from a retrieval
system.

The user receptivity of NASIC services is of considerable interest
but we can give only an informal report at this time. In Phase 2, we
will undertake a more formal survey of users (and non-users as well) to obtain
personal reactions and to have a basis for changes to the operation. We have,
however, received enough informal and unsolicited remarks by users to know that
they do like and respond positively to the in-depth, customized service and
personal attention given to their bibliographic needs.

One of our earliest users ran three searches, one on CHEMCON and
two on MEDLINE, to complete his background research. He said he was
satisfied, that he had obtained very relevant information, and that
peripheral material that at first did not seem pertinent had been found to be
useful to other members of his research group.

A CHEMCON user, an older man and a faculty member, had been a frequent
user of Chemical Abstracts in printed form. He related that he had been
extremely pleased with his search results and he also said, "Frankly, I'm
just getting too old to wade through all that stuff -- I'd rather let the
machine do it -- it's really something." His particular search came to $57
and he said it was worth every cent.

Much of the user response is seen at the Coordinator's Office when
people stop by to pick up their off-line printouts. They usually take a
few minutes to review them and have universally seemed pleased with their
results.

A graduate student from Harvard stopped by for the printouts from
combined ERIC and MEDLINE searches on information gathering by eye
movements. She said, "I could do this every day." The results were to be
shared with her research group, who were working on a proposal. Her
search is also indicative of the multidisciplinary nature of many search
problems. In this case, the topic was approached from both its
physiological aspects and its psychological or reading aspects.

Our initial efforts in integrating computerized reference service
with traditional service can also be illustrated. We saw earlier from
Table 6 that a number of users have been referred to NASIC by the library
reference staff. In turn, NASIC users are also referred to traditional
sources by the Specialists, who look to see how the user can best be served.
A student who made an appointment to use INFORM opted for a traditional
search after the Specialist told him of several printed indexes which were
more relevant for his particular problem than the machine-stored data base
he had chosen. Another user, an administrative officer, who did search
INFORM, was also told by the Specialist of two printed sources of which
he was previously unaware, one of them being Business Periodicals Index.
This individual not only used the printed sources, but has also returned
to do a further search both by computer and in printed sources.

Thus, NASIC and traditional library services are complementing one
another. The emphasis is on understanding and solving the user's problem
using the most appropriate sources and thereby gaining the fullest utility
of information resources regardless of format or media.

Analysis for Future NASIC Systems (Task 13)

M.I.T. staff have held a discussion with staff members from QEI, Inc.,
who were performing a study of evaluative procedures that will be useful
for NASIC. This discussion included consideration of criteria for
computer systems. QEI recommendations may be found in a separate report
submitted by QEI to NEBHE.

M.I.T., NERCOMP (New England Regional Computing Program), and
NEBHE staff have met to discuss potential models of NASIC operations.
In the area of NASIC services, these include a batch model, a batch
model with remote job entry input and/or remote printer outputs, and
an on-line model. Communications models between a NASIC Information
Specialist and the user include personal contact, voice phone,
Telex-like connections, mail, and lastly, direct communication between user
and supplier with NASIC acting only as comptroller. The consensus at
this time is that there should not be any dedicated NASIC computer, that
NASIC should not build a network parallel to existing regional networks,
but that NASIC should consider using existing computing facilities in
the region, and that on-line service may be the primary future mode of
operation. Model definition will be refined in Phase 2 as additional
information about existing facilities in the region and on actual
experience with the M.I.T. pilot operations is gathered.

Plans (Task 14)

The discussion under many of the preceding tasks has indicated areas
in need of further study, development, or testing. A formal plan for work
on NASIC at M.I.T. during Phase 2 has been submitted to the New England
Board of Higher Education.

Appendix A. Project Personnel

Electronic Systems Laboratory

Professor J. Francis Reintjes
Mr. Alan R. Benenfeld
Mr. Richard S. Marcus
Mr. Jorge R. Peschiera

The MIT Libraries

Miss Natalie W. Nicholson
Ms. Marjorie Chryasostomidis
Mr. Edgar W. Davy
Ms. Margaret E. DePopolo
Mr. William J. Duggan
Mrs. Patrica T. Gordon
Ms. Irma Y. Johnson
Mr. James M. Kyed
Ms. Ann S. Longfellow
Ms. Margaret A. Otto
Ms. Mary E. Pensyl
Mr. Phillip W. Piper
Mr. Peter R. Scott
Mrs. Jacqueline Stymfal
Mrs. Frances B. B. Sumner
Ms. Nancy G. Vaupel
Ms. Susan E. Woodford

Information Processing Services

Mr. Robert H. Scott

Appendix B. Sample Forms Used in Operations



Copies of the following forms are included:

Figure B1.   Initial Service Schedule for Appointments
Figure B2.   Inquiry Data - General
Figure B3.   Inquiry Data - Description
Figure B4.   Inquiry Data - Special Questions
Figure B5.   Inquiry Post Card
Figure B6.   Appointment Reminder
Figure B7.   User Problem Statement (3 pages)
Figure B8.   Work Order
Figure B9.   Appointment Log and Review Analysis
Figure B10.  Information Specialist Weekly Time Allocation Sheet
Figure B11.  Information Specialist On-Line Connection Log
Figure B12.  Rate sheet (illustrating M.I.T. charge per minute for
             on-line connection using CA Condensates)
[Figure B1, Initial Service Schedule for Appointments: form content not
legible in this copy]

NASIC AT MIT
INQUIRY DATA - GENERAL

In-Person ___  Phone ___  Mail ___     Date:        Time:        Duration:
Location: CO BE D H R S Other ___      Agency Call:              Page 1 of ___
Repeat Caller: Y N     Repeat User:

CALLER: Name:
Address:
Phone and Hours:
Alternate Address/Phone:
Dept. or Lab. or Course:
Status: MIT ___  Non-MIT ___  Affil: (ILO P A)
    Faculty   DSR Staff   Library   Graduate Student (D M) (TA RA)
    Post-Doc.   Admin. Staff   Undergraduate (1 2 3 4 5)
    Other position:
Is Inquiry for Someone Else?

FOR NCO USE ONLY:
Appointment: Day:        Time:        Appt. for:        Specialist:
Location: CO BE D H R S Other ___
Special Set-Ups:
Consultation to Discuss Services Available: General ___  For Specific Problem ___
Expected Search Services: Retrospective ___  SDI ___  TBD ___
                          On-Line ___  Off-Line ___  TBD ___
Data Base(s):        System(s):        TBD ___
Other Services:
Expected Payment Method: MIT Charge (Acct. ___)  Check  Cash  Non-MIT P.O.
                         Other (including experimental)
Brief Problem Title:

Fig. B2 Inquiry Data - General



Page ___ of ___

NASIC AT MIT
INQUIRY DATA - DESCRIPTION

Caller:                    Date:

Follow Up:

Description (include summary, problems, actions required, comments)

Description codes (numbers are given only where legible in this copy):

 1. General                         --  [illegible] Appts.
 3. Data Bases/[illegible] Syst.    --  User Funds
 5. Pricing                         --  Requisitions
 6. Specialists                     --  User Worksheets
 7. Terminals                       --  Order
 --  Outputs                        --  Change Order
 9. Output [illegible]              --  Cancel Service
10. Delivery Services               --  Billing
11. Service Hours/Locations         --  Credits

22. Payments                        31. Cancel Appt.
23. Connection Problems             --  Appt. Reminder
24. Searching Problems              --  Appt. Scheduling Problem
 --  Equipment Trouble              --  Schedule Change
26. Experiments                     50. User Topic
27. Comments                        80. NEBHE
28. Complaints                      --  Outside Agency
29. Arrange Appt.                   99. Other
 --  Change Appt.

Fig. B3 Inquiry Data - Description

Page ___ of ___

NASIC AT MIT
INQUIRY DATA - SPECIAL QUESTIONS

Caller:                    Date:

1. Several methods of announcing NASIC have been used. How did you
   specifically learn of NASIC?

   Brochure ___    Tech Talk ___    Library Bulletin ___
   Letter ___      Poster ___       Saw a Site ___
   Colleague ___   Meeting ___      Saw a Session ___
   Other ___

2-4. If no appointment has been made, and if it is not obvious from previous
   data:

2. Can you tell us the reason you do not wish to arrange for an appointment
   at this time? Your answer may help us improve upon our services.

3. What is the subject area of interest to you?

4. Do you have access to an MIT charge account?
   (If yes) Who would need to approve a requisition against that account?
   (If no) Are other funding sources available to you?

Fig. B4 Inquiry Data - Special Questions

Room 10-400          NASIC AT MIT          253-7746

Request for Information

Brochures about NASIC search services are available at the
reference desk at each MIT library. For further information
about NASIC services, kindly phone 253-7746 or stop by the
central NASIC Coordination Office, Room 10-400, between 9 and
5, Monday thru Friday. If you prefer, let us know when and
where we may contact you. Please leave this card with a
Library staff member or put it into the Institute mail.
Thank you.

Mary Pensyl, NASIC Coordinator

Name:
Address:
Phone(s):
Hours You May Be Reached:
Nature of Inquiry:

TO: MARY PENSYL
    NASIC COORDINATION OFFICE
    ROOM 10-400
    M.I.T.

Fig. B5 Inquiry Post Card

MASSACHUSETTS INSTITUTE OF TECHNOLOGY     CAMBRIDGE, MASSACHUSETTS 02139
Room 10-400          MIT LIBRARIES          253-7746

NASIC AT MIT

APPOINTMENT REMINDER

You have an appointment
with ___
at ___
on ___
at ___

Please be prompt. If you are charging NASIC services to an MIT account, be
sure to bring an authorized requisition slip with you. If you must change
your appointment, please call the NASIC Coordinator's Office, 253-7746.

You can help increase the efficiency of your appointment with the
Information Specialist and perhaps lower the cost of services to you by
carefully filling in the attached User Problem Statement before your
appointment. An initial search strategy will be developed by the Specialist
together with you and it will be based upon your replies. The initial
strategy may be modified by you and the Specialist as search results are
received and reviewed. Kindly present these forms to the Information
Specialist at the start of your appointment.
Thank you.

Mary Pensyl
NASIC Coordinator

Fig. B6 Appointment Reminder

NASIC AT MIT
USER PROBLEM STATEMENT

Name:                      Appt. Date:

Completing this form prior to your appointment will increase the efficiency
of NASIC service during your appointment and will likely lower the cost of
service to you.

1. Please give in your own words a narrative description of the problem to
   be searched. Be specific. Define phrases with special meaning. Cover all
   aspects of the problem but please underline particular phrases that are
   more important to you. Append a list to your narrative of any synonyms,
   closely-related phrases, and alternate spellings. Please indicate if any
   words or phrases have a special use that you wish to exclude. Use
   scientific and technical as well as common vocabulary.

Page 1

User Problem Statement contd.  Page 2.



Name:

2. Unless already stated, please indicate any models, end uses, or
   applications of your problem that could be helpful in retrieving useful
   references.

3. Please state any topics related to (or applications of, or views of, or
   approaches to) your specific problem that are not of interest if you wish
   to exclude retrieving references to any documents on such topics.

4. Please give a title to your problem.

5. Please list two or three of the most important authors (and/or
   organizations) publishing on your topic. Complete names are helpful.
   Please indicate if you wish to exclude documents by any of these (or
   other) authors or organizations because of prior familiarity with their
   publications.

6. Please list two or three of the most important journals covering your
   problem. Please indicate if you wish to retrieve references to documents
   from only these journals. Please indicate if you wish not to retrieve
   references to documents from any particular journal, perhaps because you
   personally receive the journal.

7. Do you wish either to retrieve or not retrieve references to documents
   written in a particular language?
   Does not matter ___        Retrieve English only ___
   Retrieve only in ___       Do not retrieve in ___

8. Do you wish to exclude references to particular types of documents?
   Exclude: ___ Journal Articles  ___ Books  ___ Patents  ___ Reports
            ___ Conference Papers  ___ Dissertations

User Problem Statement contd.  Page 3.
Name:

9. Please list the complete citations to two or three of the most useful
   articles on your search topic. (It may be helpful to bring these articles
   with you to your appointment.)

10. Would you prefer
    ___ a comprehensive search that retrieves most of the references relevant
        to your problem, but which may also retrieve many references not
        relevant to your problem?
    ___ a narrow search that may retrieve fewer references relevant to your
        problem, but which also retrieves fewer non-relevant references?

11. Can you estimate the number of relevant documents
    a) you think may be present in the literature ___
    b) you would like to retrieve and get references for ___

12. If you have previously done a literature search (manually or by computer)
    on this problem or a closely related problem, please indicate if possible
    what was searched, what difficulties were encountered, and the overall
    result of the search.

Fig. B7 continued.

NASIC at MIT
WORK ORDER - PART 1 - SUMMARY

Specialist:            Work Order No.:            Service Date:
NASIC Account:

Name:
Address:
Phone:
Bill To (if different):

[ ] MIT Requisition No. ___   Dated ___   User Account No. ___
        .766 (NASIC)   .789 ([illegible])
[ ] Purchase Order No. ___    Dated ___
[ ] Cash Receipt No. ___      Dated ___   Amount Paid $ ___   Account 11305.155
[ ] Check: ID ___  MIT ___  Personal Check ___
                                          Amount Paid $ ___   Account 11305.155
[ ] Employee   [ ] Student    MIT ID ___

SERVICES ([illegible] charge applies):    [ ] Industrial Rates Apply

Columns: Service Rate | Units Used | Cost | Object Code

*Retrospective Searches (System, Data Base, Offline Coverage)
*Partial Volume Retro. Search (System, Data Base, No. Issues)
 Current Awareness (System, Data Base, Issues)
*Specialist Services (Description, Time)
 [line illegible]
 Output Costs (Describe)                       Object codes 159, 161
*Document Services (Describe)
 Other Services (Describe)
 Consultation About Services (No Charge)

Subtotal

Credits:
 [illegible] Time                              Object codes [illegible], 173
 [illegible] Allowance                         Object code 170

CREDITS SUBTOTAL
TOTAL CHARGE
Balance Due
[ ] [illegible] of ___ is due user

Thank you for using NASIC AT MIT. Call 253-7746 should you have further
questions.

Fig. B8 Work Order

NASIC AT MIT
APPOINTMENT LOG AND REVIEW ANALYSIS

Specialist:                Date:
User:                      Work Order:                Page ___ of ___

Please record running notes of problems and important decisions during an
appointment. Later, complete the notes with a more detailed commentary and
analysis. In particular, note (1) technical problems with connections or
terminals (e.g. nature, time, duration, attempts to solve); (2) search
software problems (e.g. nature, solutions); (3) search strategy and
performance problems (e.g. nature, development, effectiveness); (4) user
interface behavior; (5) user commentary. For connection and software
problems, attach if possible the relevant sections of printouts.

Columns: Technical Problem | Time | Duration |
         Notes, Descriptions, Commentary, Analyses

Fig. B9 Appointment Log and Review Analysis



INFORMATION SPECIALIST TIME ALLOCATION SHEET

Specialist ___             Week Ending ___

Columns: ACTIVITY | NASIC SERVICES | MEDLINE SERVICES | TOTAL
(sub-column headings are not legible in this copy)

Training and Practice
   1. [not legible in this copy]
   2. Offline Search Preparation
   3. Online Search Sessions

Operations and Services
   4. Meetings
   5. Offline Preparation
   6. Online Preparation
   7. Appointments Not At Terminal
   8. Appointments At Terminal
   9. Delegated Online Searches
  10. Coord. Office Interactions

General
  11. Study
  12. Documentation
  13. Travel
  14. Other

TOTAL EXPENDED TIME
  15. Extra-curricular Time
CHARGEABLE TOTAL

Terminal Connect Time
  16. Training and Practice
  17. Operations and Services
CONNECT TIME TOTAL

Fig. B10 Information Specialist Weekly Time Allocation Sheet



ON-LINE CONNECTION LOG

INFORMATION SPECIALIST ___          MONTH ___ 197_

Columns: [illegible] | Train or Demo | [illegible] | Connect Time (minutes) |
         Citations | [illegible] Rate | Connection Problem |
         Allowance (minutes)

Fig. B11 Information Specialist On-Line Connection Log
CA CONDENSATES

ACADEMIC RATES PER MINUTE

HOURLY RATE: $67

MIN    CHARGE        MIN    CHARGE
  1    $ 1.12         51    $56.95
  2      2.23         52     58.07
  3      3.35         53     59.18
  4      4.47         54     60.30
  5      5.58         55     61.42
  6      6.70         56     62.53
  7      7.82         57     63.65
  8      8.93         58     64.77
  9     10.05         59     65.88
 10     11.17         60     67.00
 11     12.28         61     68.12
 12     13.40         62     69.23
 13     14.52         63     70.35
 14     15.63         64     71.47
 15     16.75         65     72.58
 16     17.87         66     73.70
 17     18.98         67     74.82
 18     20.10         68     75.93
 19     21.22         69     77.05
 20     22.33         70     78.17
 21     23.45         71     79.28
 22     24.57         72     80.40
 23     25.68         73     81.52
 24     26.80         74     82.63
 25     27.92         75     83.75
 26     29.03         76     84.87
 27     30.15         77     85.98
 28     31.27         78     87.10
 29     32.38         79     88.22
 30     33.50         80     89.33
 31     34.62         81     90.45

Fig. B12 Rate Sheet (illustrating MIT charge per minute for on-line
connection using CA Condensates)
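
The legible entries on the Fig. B12 rate sheet are consistent with prorating the $67 hourly rate by the minute and rounding to the nearest cent. A minimal sketch of that calculation follows. The function names `connect_charge` and `rate_table` are illustrative only and are not part of the NASIC procedures, and the rounding rule is an assumption inferred from the table rather than a documented policy.

```python
from decimal import Decimal, ROUND_HALF_UP

# Academic hourly rate for CA Condensates, as printed on the rate sheet.
HOURLY_RATE = Decimal("67")


def connect_charge(minutes, hourly_rate=HOURLY_RATE):
    """Prorate an hourly rate over whole minutes, rounded to the nearest cent.

    Rounding to the nearest cent is an assumption inferred from the table.
    """
    charge = Decimal(hourly_rate) * Decimal(minutes) / Decimal(60)
    return charge.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def rate_table(first_minute, last_minute, hourly_rate=HOURLY_RATE):
    """Return (minutes, charge) pairs for a range of connect times."""
    return [(m, connect_charge(m, hourly_rate))
            for m in range(first_minute, last_minute + 1)]


if __name__ == "__main__":
    # Print the first few rows of the sheet.
    for minutes, charge in rate_table(1, 5):
        print(f"{minutes:3d}   ${charge}")
```

Decimal arithmetic is used here so that cent values are exact; a sheet for another data base could be produced by passing a different `hourly_rate`.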
Appendix C. Sample Publicity Brochures



Copies of the following brochures are included:

Figure C1.  NASIC at MIT          - General Brochure
Figure C2.  NASIC/CA CONDENSATES  - Services for chemistry and
                                    chemical engineering
Figure C3.  NASIC/ERIC            - Services for education, linguistics
                                    and information science
Figure C4.  NASIC/INFORM          - Services for business management
Figure C5.  MEDLINE

NASIC AT MIT

Automated Bibliographical Services
For Research

NEW MIT SERVICE

A new service of the MIT Libraries will be
available under the auspices of NASIC*, on a
fee-for-service basis, at five divisional
libraries:

     Science           Dewey
     Humanities        Rotch
     Barker Engineering

Monday through Friday, on an appointment basis
beginning November 15, 1973.

The NASIC service will open on an experimental
basis and provide access to data bases in the
following fields:

     Chemistry & Chemical Engineering
     Education, Linguistics, Information Sciences
     Business, Management, Economics, Marketing
     Medicine, Biology & Related Sciences

Both on-line and off-line access to the
several data bases will be offered. An on-line
search can produce a printed list of
references that you can take with you. Full
bibliographies can also be printed off-line
and sent by mail. In some fields you can also
be alerted to new publications as they appear,
on a regular basis.

An Information Specialist will be available
at each location to explain the system and to
show you how to find recent publications
relevant to your research interest.

For information about types of services
available and associated costs, and to arrange for
an appointment with an Information Specialist,
contact the NASIC Coordinator's office:

     253-7746
     Room 10-400

*NORTHEAST ACADEMIC SCIENCE INFORMATION CENTER

A program of the New England Board of Higher
Education, NASIC is supported by the
National Science Foundation under
Grant No. GN-37296.



DATA BASES AVAILABLE

Data bases for the fields listed below are
ready now.

Chemistry & Chemical Engineering

CHEMCON, for Chemical Abstracts Condensates,
derives from Chemical Abstracts, sponsored by
the American Chemical Society, and has the
same coverage: about 6,000 articles
selected from 10,000 journals are added each
week. The on-line file goes back to 1970, the
off-line file to 1968.

Education, Linguistics, Information Sciences

The ERIC (Educational Resources Information
Center) data base is maintained by the U.S.
Office of Education. Each month about 1,000
new reports and [number illegible] new journal
articles selected from over 500 journals are
added to the on-line and off-line files, which
go back to 1966.

Business, Management, Economics, Marketing

The INFORM data base produced by ABI, Inc.
is updated monthly at a rate of about
[number illegible] articles selected from 260
journals for this on-line file, which goes back
to August 1971.

Medicine, Biology & Related Sciences

MEDLINE, operated by the National Library
of Medicine, indexes the 1,200 leading
journals in the biomedical field since 1970.
It covers about [percentage illegible] of the
material in Index Medicus, with about
[number illegible] articles being added each
month to this on-line file.

The complete NASIC information service now
being planned will eventually include data
bases for all major fields of research interest
at MIT. New data bases will be added within the
next few months to cover interests in government
research, engineering and physics. All these
data bases will be available through the NASIC
regional network except MEDLINE, which is
available on-line by separate arrangement with
the National Library of Medicine.

Fig. C1 NASIC at MIT - General Brochure

Two broad classes of service are being
offered: retrospective searching of data
bases, and a current awareness (alerting)
service. All data bases can be searched
on-line, and either immediate on-line printouts
or delayed off-line printouts are possible.
Most data bases can also be searched in an
off-line mode.

The on-line search mode enables the operator
to converse directly with the computer and
to obtain an immediate response to a query in
the form of a printed list of citations. This
interactive feature makes it possible to use
the system in an exploratory way to improve
the effectiveness of the search. The operator
can modify the search words and adjust the
strategy as the search progresses in order to
achieve a closer match with the needs of the
researcher. This exploratory capability, with
the aim of refining the definition of the
bibliographic problem, is one of the most
important features of the system.

If the list of citations is long and the need
is not immediate, an additional option permits
printing off-line and delivery by mail, at
substantial savings.

Also available for some data bases is an
off-line retrospective search service. This
option may result in lower cost for extensive
search and printout requirements.

A current awareness service option that will
alert you at regular intervals to new
publications in your field is available for
several of the data bases.

Search magnitude can be limited in various
ways to a partial data base and by [illegible],
author, institution and other requirements.
USER ASSISTANCE

The assistance provided by the Information
Specialist is an essential part of the new
service. Each of the information retrieval
systems has its peculiar operational language
[partly illegible], logic, range of access,
procedures, kind of information and form of
[illegible]. The Information Specialist is
familiar with all of the available data bases.
This knowledge will be particularly valuable in
formulating search strategies in contact
with the system and in dealing with
interdisciplinary information requests.

The primary task of the Information Specialist
is to assist you in translating your
problem statement into the language of the
system in order to help you to maximize the
satisfaction you derive from the system and
to minimize the cost of making a search.
This user interaction may take half an hour
or more to develop an appropriate search
strategy.



SPONSORSHIP

The NASIC computer-based bibliographic service
is being developed by the New England Board
of Higher Education under a grant from the
National Science Foundation. The experimental
MIT service is being tested by the MIT
Libraries and Electronic Systems Laboratory
under a contract with NASIC. The experimental
service at MIT will become the first node of a
regional network of science information
centers located at university libraries in the
northeast region. Policies and procedures for
the NASIC network will be based on experience
gained at MIT during this experimental period.

The MEDLINE system for the biomedical sciences
is not a part of the NASIC service, but is
made available at the same terminals at MIT by
arrangement with the National Library of
Medicine.

COST

Although development costs are being
underwritten by NSF, operating costs must be
recovered on a fee-for-service basis. Fees
vary with the amount of service provided by
the Information Specialist, the data base
searched and the time spent at the terminal.

Since MEDLINE is substantially subsidized
by the National Library of Medicine, the cost
of searching this data base is less than
search costs for the others.

The price structure includes a fee for the
Specialist's time spent in developing search
strategies with you and for operating the
system. Because time spent at the terminal
is expensive, users can generally minimize
overall costs by taking advantage of the
skills of the Information Specialist. There
is a charge of [amount illegible] per hour of
Specialist's time with a minimum charge of
[amount illegible].(A)

Further details of fees are included in
separate brochures that describe each data
base. Typical examples of computer search
costs are, for a half hour of time spent
at the terminal searching: the business data
base INFORM -- [amount illegible]; for a similar
search of MEDLINE -- [amount illegible], for
ERIC -- [amount illegible], for CHEMCON --
[amount illegible]. Times and prices may
be lower for simple problems or higher for
more complex problems or those that have not
been well defined initially by the user. An
off-line search of a one year collection of
ERIC would be [amount illegible]. A current
awareness subscription service for the
Chemistry data base would cost approximately
[amount illegible] for each week the service
is rendered. Off-line printing charges are at
the rate of 10 cents per printed page.
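
The fee components named in the brochure (terminal connect time, Specialist time, and off-line printing) can be combined into a rough estimate of the cost of one search. The sketch below is illustrative only: the function `estimate_search_cost` and its parameter names are hypothetical, every rate is supplied by the caller, and the Specialist rate in the usage note is a placeholder because the printed figure is not legible in this copy.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def estimate_search_cost(connect_minutes, connect_rate_per_hour,
                         specialist_hours, specialist_rate_per_hour,
                         offline_pages=0, page_rate=Decimal("0.10")):
    """Estimate the cost of one search from its three fee components.

    All rates are supplied by the caller; none is an official NASIC price.
    Pass fractional values as strings (e.g. "0.5") to keep them exact.
    """
    # Connect time is billed by the minute at an hourly rate.
    connect = (Decimal(connect_rate_per_hour)
               * Decimal(connect_minutes) / Decimal(60))
    # Specialist time is billed by the hour.
    specialist = Decimal(specialist_rate_per_hour) * Decimal(specialist_hours)
    # Off-line printing is billed per printed page.
    printing = Decimal(page_rate) * Decimal(offline_pages)
    total = connect + specialist + printing
    return total.quantize(CENT, rounding=ROUND_HALF_UP)
```

For example, `estimate_search_cost(30, 67, "0.5", "10.00", offline_pages=20)` combines a half hour at the $67 hourly connect rate shown on the Fig. B12 rate sheet, a half hour of Specialist time at a placeholder $10.00 per hour, and 20 printed pages at 10 cents each.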

For further information and to arrange for
an appointment with an Information
Specialist, call 253-7746.

(A) During the initial "break-in" period,
users will receive a credit for the Information
Specialist's time up to a maximum
credit of [amount illegible]. This credit is
good until the end of the academic year, in
June 1974.

Fig. C1 continued.




NASIC*/CA-CONDENSATES
Automated Bibliographic
Services for Research

Services for Chemistry and
Chemical Engineering

THE DATA BASE

CA-Condensates is the chemistry data base
corresponding to the publication Chemical
Abstracts produced by the Chemical
Abstracts Service of the American Chemical
Society. The worldwide data gathering
capability of CAS provides comprehensive
coverage of the literature in all fields
of chemistry and disseminates bibliographic
information on this literature in
both printed and machine-readable form.
The data base is issued on a weekly basis,
each issue covering one half of the
total subject scope of the data base.
Searches may be tailored to the odd or
even numbered issues.

NASIC/CA-CONDENSATES AT MIT

Computer-based bibliographic services
including CA-Condensates add a new dimension
to information retrieval traditionally
performed by manual techniques.
NASIC services, available through the MIT
Libraries, enable you to employ a more
exhaustive combination of retrieval
parameters at relatively low cost to
produce rapid and highly relevant search
results.

USER ASSISTANCE

The CA-Condensates data base is accessible
in interactive on-line or remote batch
modes. Information Specialists are
available to assist you in the use of
these services. For information about
types of services available and associated
costs, and to arrange for an appointment
with an Information Specialist, contact
the NASIC Coordinator's office:

     253-7746
     Room 10-400

*NORTHEAST ACADEMIC SCIENCE
 INFORMATION CENTER

A program of the New England Board of
Higher Education, NASIC is supported
by the National Science Foundation
under Grant No. GN-37296.


ilASlC/CA-COUDEiJSATES 



Chemistry and Chemical Engineering related
topics are covered, as in Chemical
Abstracts, in five major sections:

1. Biochemistry Sections (CBAS)

2. Organic Chemistry Sections (CAOS)

3. Macromolecular Sections (CAMS)

4. Applied Chemistry and Chemical
   Engineering Sections (CAAS)

5. Physical and Analytical Chemistry
   Sections (CAPS)

COVERAGE 

The CA-Condensates data base covers the
chemistry-related literature published in
over 12,000 journals as well as patents
issued by 26 countries. New books,
conference proceedings, and government
research reports are regularly monitored to
select those documents pertinent to the
chemical sciences. The on-line data base
references information from the several
issues of Chemical Abstracts published
since 1970. The off-line data base
begins with Chemical Abstracts volume
No. 69, first published in July 1968.

FILE SIZE AND UPDATING

The on-line CA-Condensates data base
presently contains records for over 1 JOO.OOO
documents, while the off-line data base
contains over 1,500,000 records.
Approximately U,000 new records are added to
the data base each month. The on-line
file is updated biweekly. The off-line
file is maintained in two parts:
biochemistry and organic chemistry in one
part (corresponding to the odd-numbered
issues of Chemical Abstracts) and the
other three sections in a second part
(even-numbered issues of Chemical
Abstracts). Each part is updated
separately on an alternating week basis.
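
The two-part layout of the off-line file described above can be sketched as a small lookup. This is only an illustration under stated assumptions: the group names are taken from the brochure's list of five section groupings, while the function name and the "odd"/"even" labels are hypothetical and not part of any NASIC software.

```python
# Section groups held in each part of the off-line file, per the brochure.
# Part labels ("odd", "even") are illustrative names, not system identifiers.
ODD_PART = {"Biochemistry", "Organic Chemistry"}
EVEN_PART = {
    "Macromolecular",
    "Applied Chemistry and Chemical Engineering",
    "Physical and Analytical Chemistry",
}


def parts_to_search(section_groups):
    """Return the sorted list of file parts needed to cover the given groups."""
    parts = set()
    for group in section_groups:
        if group in ODD_PART:
            parts.add("odd")
        elif group in EVEN_PART:
            parts.add("even")
        else:
            raise ValueError(f"unknown section group: {group}")
    return sorted(parts)


if __name__ == "__main__":
    # A request limited to organic chemistry touches only the odd-issue part.
    print(parts_to_search(["Organic Chemistry"]))
    # A request spanning both halves of the subject scope needs both parts.
    print(parts_to_search(["Biochemistry", "Macromolecular"]))
```

A search confined to one part is the case the brochure's rate schedule treats as an odd-only or even-only search.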

RECORD CONTENT

The CA-Condensates data base includes the
following information elements from the
corresponding issues of Chemical
Abstracts: titles of papers, patents,
reports; names and organizational
affiliation of authors and/or assignees;
bibliographic citation; language of document
and subject indexing.
Fig. C2 NASIC/CA-CONDENSATES Brochure
NASIC/CA-CONDENSATES



ACCESS OPTIONS

NASIC's computer-based bibliographic data
bases on Chemistry and Chemical Engineering
literature are available for search
in on-line or off-line modes.

SERVICES AVAILABLE

Current awareness and retrospective search
services tailored to your specific
interests are now available at the MIT
Libraries. The Current Awareness Service
provides routine periodic notification of
the most recent publications which match
the subscribing researcher's interest
profile. Retrospective Search Services,
generally covering several years of
publications, are also available on-line
or off-line. For many of the citations
obtained through your NASIC search, you
may obtain through the MIT Libraries a
photocopy (or in some cases, hard copy or
a microfilm copy) of the full text of
the document.

COST OF SERVICES 

Charges to academic users will be based on
the following rates:

Current Awareness: $370 (annual
subscription, weekly delivery)

Retrospective Search

On-line: $SS per connect hour at
terminal; minimum charge $5.

Off-line: $166 per year of data base
searched (A)

Information Specialist assistance:
$8 per hour; minimum charge $5 (B)

Off-line computer printouts @ ten cents
per page (4x6 card stock also available
@ two cents each extra)

(A) Half-price if only odd or even issues
searched

(B) During the initial "break-in" period,
users will receive a credit for the
Information Specialist's time up to a
maximum credit of $40.00. This offer
expires June 1974.



NASIC - A REGIONAL RESOURCE

NASIC - The Northeast Academic Science
Information Center - is being developed by
the New England Board of Higher Education
(NEBHE) to provide the Northeast area with
a central access point to the nation's
growing and diverse information resources
in computer-readable form. This development
is being aided by staff of the Association
of Research Libraries, the Massachusetts
Institute of Technology and by
other organizations and consultants
working under subcontract to NEBHE.

By aggregating data bases and existing
information services, NASIC provides
comprehensive and in-depth services to
users. NASIC thus aids in increasing
the capabilities of the Northeast's
academic community.

The increasing availability of computer-
readable data bases now makes it possible
for R&D personnel to keep up with the
proliferation of professional journals
and with the growing record of experimental
and statistical data. Computers
permit searching of hundreds of thousands
of references in the time it would take
a human researcher to read one page.

NASIC AT MIT

To assist in meeting the information needs
of the MIT community, a number of
computerized bibliographic services are already
available for several subject disciplines.
Others will soon be added and, eventually,
all major fields of research interest will
be covered.

For further information on all computer-
based services available at the MIT
Libraries, contact the NASIC Coordinator's
office:



253-7746
Room 10-400

40 Grove Street
Wellesley, Massachusetts 02181
(617) 23b-?.0n 
NASIC/ERIC
Automated Bibliographic
Services for Research

Services for Education, Linguistics
and Information Science

The data base

ERIC (Educational Resources Information
Center) is the educational data base
developed and maintained by the U.S. Office
of Education. Eighteen clearinghouses
located throughout the United States, and
now reporting to the National Institute of
Education, collect, screen, index, and
abstract the report and periodical
literature in education and education-related
fields.

NASIC/ERIC AT MIT

Computer-based bibliographic services
including ERIC add a new dimension to
information retrieval traditionally performed
by manual techniques. NASIC services,
available through the MIT Libraries,
enable you to employ an exhaustive
combination of retrieval parameters at relatively
low cost to produce rapid and highly
relevant search results.

USER ASSISTANCE

The ERIC data base is accessible in
interactive on-line or remote batch modes. MIT
Information Specialists are available to
assist you in the use of these services.
For information about types of services
available and associated costs, and to
arrange for an appointment with an
Information Specialist, contact the NASIC
Coordinator's office:



253-7746
Room 10-400

NORTHEAST ACADEMIC SCIENCE
INFORMATION CENTER

A Program of the New England Board of
Higher Education, NASIC is supported
by the National Science Foundation
under Grant No. GN-37296.



NASIC/ERIC



SUBJECT AREAS

Education and education-related topics in
ERIC include:

Adult Education
Counseling & Personnel Services
Disadvantaged
Early Childhood Education
Educational Management
Educational Media & Technology
Exceptional Children
Higher Education
Junior Colleges
Languages & Linguistics
Library & Information Sciences
Reading & Communication Skills
Rural Education & Small Schools
Science, Mathematics & Environmental
Education
Social Studies/Social Science Education
Teacher Education
Tests, Measurement & Evaluation
Vocational & Technical Education.

COVERAGE

The ERIC data base covers educational
literature published since 1969* and contains
all citations published in Research in
Education (RIE) and Current Index to
Journals in Education (CIJE), the two
major printed monthly products of the ERIC
system.

FILE SIZE AND UPDATING

The ERIC file currently contains records
for over 135,000 documents. Approximately
1000 new reports and 1500 new journal
articles selected from over 500 journals
are added monthly into the ERIC file.

RECORD CONTENT 

The ERIC record includes the following
information for each document: the title,
author name(s) and organizational
affiliation, the publication citation (when and
where published), and availability data
(including price for microfiche or paper
copy from ERIC); subject indexing and
sponsoring agency, with contract or grant
number. RIE also has abstracts for all
primary documents. Searching is possible
on any item of information in the record.

*A limited number of documents going back
to 19^C> is also included.
ACCESS OPTIONS

NASIC's computer based bibliographic data
bases on Educational Literature are
available for search in on-line and off-line
modes.

SERVICES AVAILABLE

Current awareness and retrospective
search services tailored to your specific
interests are now available at the MIT
Libraries. The Current Awareness Service
provides routine periodic notification of
the most recent publications which match
the subscribing researcher's interest
profile. Retrospective Search Services,
generally covering several years of
publications, are also available on-line
and off-line. For many of the citations
obtained through your NASIC search, you
may obtain through the MIT Libraries a
photocopy (or in some cases, hard copy or
microfilm copy) of the full text of the
document.

COST OF SERVICES

Charges to academic users will be based on
the following rates:

Current Awareness: Se5 (A) (annual
subscription, quarterly delivery)

Retrospective Search

On-line: ^AC- per connect hour at
terminal; minimum charge

Off-line: i7C per year of data base
searched (A,B)

Information Specialist assistance:
$8 per hour; minimum charge $5 (C)

Off-line computer printouts @ ten cents
per page (4x6 card stock also available
@ two cents each extra)



(A) Half charge if only RIE or CIJE
searched

^^^Pll iearcri f :r iz treated as 

nne yea r 0 S jd . 00 

(C) During the initial "break-in" period,
users will receive a credit for the
Information Specialist's time up to a
maximum credit of $40.00. This offer
expires June 1974.



NASIC - A REGIONAL RESOURCE

NASIC - The Northeast Academic Science
Information Center - is being developed by
the New England Board of Higher Education
(NEBHE) to provide the Northeast area with
a central access point to the nation's
growing and diverse information resources
in computer-readable form. This development
is being aided by staff of the Association
of Research Libraries, the Massachusetts
Institute of Technology and by
other organizations and consultants
working under subcontract to NEBHE.

By aggregating data bases and existing
information services, NASIC provides
comprehensive and in-depth services to
users. NASIC thus aids in increasing
the capabilities of the Northeast's
academic community.

The increasing availability of computer-
readable data bases now makes it possible
for R&D personnel to keep up with the
proliferation of professional journals
and with the growing record of experimental
and statistical data. Computers
permit searching of hundreds of thousands
of references in the time it would take
a human researcher to read one page.

NASIC AT MIT

To assist in meeting the information needs
of the MIT community, a number of
computerized bibliographic services are already
available for several subject disciplines.
Others will soon be added and, eventually,
all major fields of research interest will
be covered.

For further information on all computer-
based services available at the MIT
Libraries, contact the NASIC Coordinator's
office:

253-7746



THE NEW ENGLAND BOARD OF HIGHER EDUCATION

40 Grove Street
Wellesley, Massachusetts 02181
£617) 2:i'j-uun 
NASIC/INFORM

Automated Bibliographic
Services for Research



Services for Business Management



The data base

INFORM is a business management oriented
data base produced by Abstracted Business
Information, Inc. This data base covers
approximately ___ journals specializing in
the areas of finance, management,
economics, statistics, business law and
marketing.

NASIC/INFORM AT MIT

Computer-based bibliographic services
including INFORM add a new dimension to
information retrieval traditionally
performed by manual techniques. NASIC
services, available through the MIT
Libraries, enable you to employ a more
exhaustive combination of retrieval
parameters at relatively low cost to
produce rapid and highly relevant search
results.



The INFORM data base is accessible in the
interactive on-line mode. MIT Information
Specialists are available to assist you in
the use of these services. For information
about types of services available and
associated costs, and to arrange for an
appointment with an Information Specialist,
contact the NASIC Coordinator's
office:

253-7746
Room 10-400



INFORMATION CENTER

A Program of the New England Board of
Higher Education, NASIC is supported
by the National Science Foundation
under Grant No. GN-37296.



NASIC/INFORM

SUBJECT AREAS

The INFORM data base reflects comprehensive
coverage of the business literature
through references to feature articles
from well known journals including:

Banking
Best's Review L/H
Best's Review P/L
Business Horizons
California Management Review
Dun's Review
Fortune
Harvard Business Review
Personnel
Personnel Journal
Public Utilities Fortnightly
Sales Management
Sloan Management Review
Technology Review (MIT)

Journal of r i. i nr; 

Journal of T<ixr*tion 

Nation's Business



COVERAGE

The INFORM data base covers the major
business-management-related literature
published in over 240 journals since
August 1971.

The INFORM file contains approximately
10,000 records. An average of G'JC
records are now added monthly by ABI into
the INFORM file. Updates are searchable
independently to provide a current
awareness service.

The INFORM record contains the title,
author, abstract, the publication citation
and subject indexing as well as other
categories of information.
ACCESS OPTIONS

NASIC's computer based bibliographic data
base on business related literature is
available for search in an on-line mode.

SERVICES AVAILARLE 



Current Awareness and retrospective search
services tailored to your specific
interests are now available at the MIT
Libraries. The Current Awareness Service
provides periodic notification of the most
recent publications which match the
subscribing researcher's interest profile.
Retrospective Search Services, generally
covering several years of publications,
are also available on-line. For many of
the citations obtained through your NASIC
search, you may obtain through the MIT
Libraries a photocopy (or in some cases,
hard copy or microfilm copy) of the full
text of the document.



COST OF SERVICES

Charges to academic users will be based on
the following rates:

Current Awareness (available through
periodic on-line searching of the
updates)

Retrospective Search

On-line: IQ? per connect hour at
terminal; minimum charge $5.

Information Specialist assistance:
$8 per hour; minimum charge $5 (A)

Off-line computer printouts @ ten cents
per page.



(A) During the initial "break-in" period,
users will receive a credit for the
Information Specialist's time up to a
maximum credit of $40.00. This offer
expires June 1974.



NASIC - A REGIONAL RESOURCE

NASIC - The Northeast Academic Science
Information Center - is being developed by
the New England Board of Higher Education
(NEBHE) to provide the Northeast area with
a central access point to the nation's
growing and diverse information resources
in computer-readable form. This development
is being aided by staff of the Association
of Research Libraries, the Massachusetts
Institute of Technology and by
other organizations and consultants
working under subcontract to NEBHE.

By aggregating data bases and existing
information services, NASIC provides
comprehensive and in-depth services to
users. NASIC thus aids in increasing
the capabilities of the Northeast's
academic community.

The increasing availability of computer-
readable data bases now makes it possible
for R&D personnel to keep up with the
proliferation of professional journals
and with the growing record of experimental
and statistical data. Computers
permit searching of hundreds of thousands
of references in the time it would take
a human researcher to read one page.

NASIC AT MIT

To assist in meeting the information needs
of the MIT community, a number of
computerized bibliographic services are already
available for several subject disciplines.
Others will soon be added and, eventually,
all major fields of research interest will
be covered.

For further information on all computer-
based services available at the MIT
Libraries, contact the NASIC Coordinator's
office:

Room 10-400



Wellesley, Massachusetts 02181
(CI?) :^ii/'.^'J/l 



fnOl 1 L*'j 1 {■ 
MEDLINE
at
MIT

MEDLINE

The MIT Libraries offer a new service:

Automated search of the recent literature
of biology, medicine, and related sciences
through MEDLINE, an on-line, computer-stored
bibliographical information service operated
by the National Library of Medicine.

MEDLINE indexes the 1200 leading journals
in the biomedical fields, and is more
up-to-date than the published Index Medicus.

MEDLINE will be available to the MIT
community, on a fee-for-service basis,
Monday through Friday, by appointment
at five divisional libraries:

Science
Barker Engineering
Dewey
Rotch
Humanities

An Information Specialist will be on hand
at these libraries to explain and operate
the system, and to show you how to devise
a search strategy that will identify recent
publications relevant to your research
interests.

An on-line search can produce a printed
list of references that you can take with
you. Full bibliographies can also be
printed off-line and sent to you by mail.
Special searches of the most recent
citations to be included in a forthcoming
issue of Index Medicus are also available.

For information about services and costs,
and to arrange an appointment with an
Information Specialist, call the MEDLINE
Coordinator's Office: 253-7746, Room
10-400.



The data base

MEDLINE is an on-line, interactive
bibliographical information retrieval system
operated by the National Library of
Medicine. The data base is stored in computers
in different parts of the country, and is
now accessible through an international
communications network known as TYMSHARE.

Searching with MEDLINE gives faster and
more up-to-date results. It indexes the
1200 leading biomedical journals, using
the standard Medical Subject Headings,
which are arranged in categories such as:

Anatomical Terms
Organisms
Diseases
Chemicals and Drugs
Psychiatry and Psychology
Biological Sciences
Physical Sciences
Health Care
Biochemistry

MEDLINE includes about 60 percent of the
material in Index Medicus.* It covers the
last three years and is updated monthly.
New citations are available several weeks
before they can appear in the printed
index.

The system now includes about 500,000
records, each of which contains these
items:

Author
Title
Journal citation
Year
Language
Subject headings

The MEDLINE system is more versatile than
the ordinary printed index because it may
be searched not only by subject and author,
but in several other ways. Subject headings
and search results may be combined in
various ways to achieve a close match with
your research interests.
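
Combining subject headings and earlier search results, as described above, amounts to set operations on lists of citations. The sketch below is only illustrative: the citation numbers and the `combine` helper are hypothetical, and the operator names AND, OR and NOT are assumed for the example rather than taken from the brochure or from the MEDLINE command language.

```python
# Hypothetical posting lists: subject heading -> set of citation numbers.
postings = {
    "Diseases": {101, 102, 103, 104},
    "Chemicals and Drugs": {103, 104, 105},
    "Psychiatry and Psychology": {104, 106},
}


def combine(op, left, right):
    """Combine two result sets with an assumed Boolean operator."""
    if op == "AND":   # citations carrying both concepts
        return left & right
    if op == "OR":    # citations carrying either concept
        return left | right
    if op == "NOT":   # citations in the first set but not the second
        return left - right
    raise ValueError(f"unsupported operator: {op}")


if __name__ == "__main__":
    # Narrow: both a disease heading and a drug heading.
    narrowed = combine("AND", postings["Diseases"], postings["Chemicals and Drugs"])
    # Broaden: add everything indexed under a third heading.
    broadened = combine("OR", narrowed, postings["Psychiatry and Psychology"])
    # Exclude: drop the psychiatry citations from the narrowed set.
    excluded = combine("NOT", narrowed, postings["Psychiatry and Psychology"])
    print(sorted(narrowed), sorted(broadened), sorted(excluded))
```

Each intermediate result is itself a set, so it can be fed back into a later combination, which is the sense in which search results as well as headings can be combined.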



*Note, however, that full Index Medicus
coverage is available with the SDILINE
file described on opposite page.
Kinds of Service

MEDLINE is normally used for an on-line
search of the complete data base, with an
immediate printout of a list of all
documents for the last three years that match
the user's request.

A printout can present any combination of
the bibliographical items included in the
records. It can be a list of titles
only, or of authors and titles, or it can
include all the information stored. If
a list is long and not needed immediately,
it can be printed off-line at reduced
cost and sent by mail.

The on-line, interactive feature of MEDLINE
means that the user is in continuous
conversation with the computer so that he
can modify and refine his search as he goes
along. A skillful operator can use the
system in an exploratory way to improve
the effectiveness of the search. A first
attempt often yields a list too long or
too short to be useful. The Information
Specialist can suggest various techniques
for broadening or narrowing the search,
and various ways of combining lists to
identify the relevant documents. This
kind of exploratory work with the aim of
refining the definition of the
bibliographical problem is one of the most
important uses of the system.

The primary task of the Information
Specialist is to assist you in translating
your problem statement into the language
of the system in order to help you to
maximize the satisfaction you derive from
the system and to minimize the cost of
making a search. This user interaction
with the Information Specialist may take
half an hour or more to develop an
appropriate search strategy.

A subsystem known as SDILINE contains the
citations to be included in the forthcoming
issue of Index Medicus and is separately
searchable. SDILINE differs from the main
MEDLINE data base in several ways: it
covers the full range of journals in Index
Medicus rather than the MEDLINE selection,
and each significant word in the title is
searchable. (In MEDLINE, titles are not
directly searchable.)






Cost

The charge for assistance by the Information
Specialist is $8 per hour with a minimum
charge of $5.*

In addition, the charge for time spent using
the terminal is $18 per hour with a minimum
charge of $5. One-third of this fee goes
to the National Library of Medicine which
is subsidizing a major portion of the
MEDLINE costs. The remainder goes to M.I.T.
to recover its costs.

There is also a charge for off-line
printouts at the rate of ten cents per printed
page.

A typical search might take a half hour at
the terminal and a total of one hour with
the Information Specialist for a total
charge of $17.*
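
The typical-search figure above can be checked with a short calculation. This is a sketch under stated assumptions: the terminal rate of $18 per hour and the $5 minimums are read from the brochure, while the Information Specialist rate of $8 per hour is an assumption, chosen because it is the rate at which a half hour at the terminal plus one hour of assistance comes to the quoted $17 (the scanned figure is hard to read). The function names are hypothetical, and off-line printout charges and the break-in credit are left out.

```python
def hourly_charge(hours, rate, minimum):
    """Charge for time billed by the hour, subject to a minimum charge."""
    return max(hours * rate, minimum)


def medline_search_cost(terminal_hours, specialist_hours,
                        terminal_rate=18.0,    # dollars per hour, from the brochure
                        specialist_rate=8.0,   # dollars per hour, ASSUMED (see lead-in)
                        minimum=5.0):          # minimum charge for each component
    """Total charge for one search, excluding printouts and credits."""
    terminal = hourly_charge(terminal_hours, terminal_rate, minimum)
    specialist = hourly_charge(specialist_hours, specialist_rate, minimum)
    return terminal + specialist


if __name__ == "__main__":
    # The brochure's typical search: half an hour on-line, one hour of assistance.
    print(medline_search_cost(0.5, 1.0))
    # A very short session is still billed at both minimum charges.
    print(medline_search_cost(0.1, 0.25))
```

Under these assumptions the minimum charges only matter for sessions shorter than about 17 minutes at the terminal or about 38 minutes with the Information Specialist.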

Other Data Bases

The MEDLINE service is one element of a
comprehensive program of bibliographical
information-retrieval services now being
planned by the MIT Libraries, to cover the
major fields of research interest at MIT.
The program is designed as an integrated
service with a number of different data
bases all available from the same terminal
under the guidance of an experienced
Information Specialist.

The integrated MIT service is currently an
experimental node in a regional network of
information centers in university libraries.
This network is known as the Northeast
Academic Science Information Center (NASIC),
and is being developed by the New England
Board of Higher Education under a grant from
the National Science Foundation, with the
cooperation of the MIT Electronic Systems
Laboratory.

At present, MEDLINE is not a part of the
NASIC system, but is made available at the
same terminals at MIT through the
cooperation of the New England Regional Medical
Library Service (NERMLS).



*During the initial "break-in" period,
academic users will receive a credit for the
Information Specialist's time up to a
maximum credit of $40.00. This credit
expires in June 1974.
SUMMARY

This report gives the results of a study conducted by 
QEI, Inc. for the management of the program for the North-
east Academic Science Information Center (NASIC) which has
been initiated by the New England Board of Higher Education. 
The NASIC program is supported by the National Science Found- 
ation and is intended to serve the science community of the 
ten-state area of the northeastern United States. The pro- 
gram was initiated early in 1973. 

The purpose of this study is to provide the NASIC
management with background information, together with
appropriate discussion of options and alternatives, that could
assist it in making decisions regarding the kind of information
services to be provided by NASIC and the methods to be used
to provide these services. A particular emphasis in the study
is a review of guidelines, acceptance criteria and methods for
evaluation as they have a bearing on proposed NASIC services and
products.

The thirteen sections of this report and the attachments
are listed in the Table of Contents. Each section discusses
a particular aspect of NASIC information services operations.
A brief summary for each section is given below.
A. NASIC Objectives and Some Factors Influencing the
Attainment of these Objectives (pp. 14-18)

The function of NASIC is to improve accessibility of
information needed by the science community in the Northeast
United States. The objective includes two aspects:

1. Improving the accessibility of information: searching
of accumulated files of data or references in response
to specific information needs or requests:
retrospective searching.

2. Improving the exposure to science information:
providing service to inform users, on a continuing basis,
of newly available data or information: current
awareness service.

In these activities, NASIC's role is essentially that
of an interface between the scientific community and the
universe of available bibliographic and data resources.

In terms of effectiveness, the goal of NASIC is to provide 
the maximum level of exposure/accessibility to relevant data 
bases for the science community. In terms of cost-effectiveness 
the NASIC goal is to achieve maximum exposure/accessibility per 
dollar expended. 

This section discusses three broad factors that will exert 
a major influence on the performance of NASIC: 

What data bases NASIC makes available. 

How these data bases are acquired. 

How the data bases acquired by NASIC are made available 
to potential users. 
B. Factors Affecting the Performance of Information Retrieval
and Dissemination Systems (pp. 19-34)

This section reviews the factors affecting system
performance relating to two types of information retrieval operation:
the delegated search and the non-delegated search. The major
factors which influence the success or failure of the delegated
search are shown in Figure 2:

Coverage and appropriateness of the system. 

User's interpretation of system capabilities and limitations. 
Mode of interaction with the system. 
User's ability to describe his needs. 

System assistance and guidance in formulation of request
(request forms, interviews, interactive procedure).

Searcher's interpretation of user requirements.

Ability of the vocabulary to describe concepts occurring
in requests. Help given to searcher by vocabulary
structure.

Searcher's own ability to construct logically sound and
complete strategy.

Capabilities of searching software.

Indexing policy (e.g., exhaustivity level). Indexing
quality and accuracy.

Searcher's interpretation of user requirements. Quality
of the user's request statement. Quality of the
document surrogate.

Major factors influencing the performance of a non-delegated 
search are also covered in this section and are given in Figure 3. 

Knowing what these factors are assists in making decisions 
relating to proposed NASIC services and products. 
C. Criteria Relating to the Selection of Data Bases (pp. 35-54) 

A major emphasis in the NASIC program will be the provi- 
sion of services based upon machine-readable data bases. The 
number of data bases available has increased dramatically in 
the last few years: 26^ machine-readable data base systems 
were identified in a recent survey. 

This section reviews the question of how NASIC can
determine which data bases to select and with what order of
priority. Criteria affecting the selection of data bases are
discussed in detail and are summarized in Table 1. Criteria are

discussed under the following headings: 

Subject Matter of Data Bases 

Cost Factors 

Quality Considerations 

Coverage 

Time Factors 

Indexing and Vocabulary Factors 
Implementation Factors 

D. Acquisition of Data Bases or Data Base Services:
Alternatives (pp. 55-65)

NASIC may offer services based on acquiring a data base
for in-house use or on acquiring services from that data base
from a service center already in existence.

Some data bases are available on a straight purchase basis,
although the majority must be leased or licensed. Generally a
leasing arrangement allows the recipient organization to offer
services for its own staff purposes only. If services are to
be offered to a wider community on a fee basis, a licensing
agreement must be made with the data base supplier.

Various aspects involved in the acquisition of bibliographic
services from existing suppliers are reviewed in this section.
NASIC will be faced with various service options and these are
discussed. Where service from a particular data base is already
available from a service center this mode of access will be the
preferred one. Where a choice exists between purchase of service
from another service center or from the producer of the data base,
specific cost comparisons must be made.
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Guidelines on the Choice of Service Centers (pp. fifi-Ql) 
With a number of different suppliers of services avail" 
able, NASIC must decide which information service centers to 
use. In this section various considerations relating to the 
choice of service center are discussed. 

The comparison or evaluation of service centers is closely
related to the criteria that users of information services
apply for evaluation purposes. Table 4 summarizes some of these
criteria, which are discussed on pp. 66-91 of this section. The
principal criteria include: cost (direct charges and effort
involved in use); response time; quality considerations
(coverage, completeness, recall, precision, novelty, accuracy
of data); and cost-effectiveness (cost per reference supplied,
cost per new relevant reference supplied).
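
The two cost-effectiveness criteria named above reduce to simple unit costs. The following is a minimal sketch in Python; the function name and all figures are hypothetical illustrations and are not data from any service center discussed in this report.

```python
def unit_costs(total_cost, refs_supplied, new_relevant_refs):
    """Return (cost per reference supplied, cost per new relevant reference).

    All inputs are hypothetical. A search that yields no references of a
    given kind is treated as having an infinite unit cost on that measure.
    """
    per_ref = total_cost / refs_supplied if refs_supplied else float("inf")
    per_new_relevant = (
        total_cost / new_relevant_refs if new_relevant_refs else float("inf")
    )
    return per_ref, per_new_relevant


# Two hypothetical service centers answering the same request.
center_a = unit_costs(total_cost=30.0, refs_supplied=120, new_relevant_refs=15)
center_b = unit_costs(total_cost=45.0, refs_supplied=60, new_relevant_refs=30)
print(center_a)  # (0.25, 2.0)
print(center_b)  # (0.75, 1.5)
```

In this invented case the center that is cheaper per reference supplied is the more expensive one per new relevant reference, which is why the two measures are listed separately.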

A major decision NASIC must face for several data bases is
whether access should be made through a center offering off-line
batch services or through a service center making the data base
available for direct on-line interrogation. The various aspects
of on-line services and alternatives are reviewed in this section
on pp. 71-82. Table 5 includes a typical schedule of commercially
available on-line service. Tables 6 through 11 include summary
data about the characteristics of current awareness system
features and search and output characteristics which must be
taken into account in determining what services will be of
interest for NASIC operations.

To reinforce and supplement the material of this section,
a comprehensive report on "The Present Status of On-Line
Interactive Retrieval Systems" by Lancaster is provided
as an attachment on pp. 154-206 of this report.

F. Some Considerations Relating to NASIC Products and Services
(pp. 92-100)

This section reviews certain kinds of information services
which NASIC will wish to consider for its operation. Information
services are of two broad types: current awareness services (such
as selective dissemination: SDI) and retrospective search
services (on-demand).

It seems likely that NASIC should give first priority to
the SDI aspects of its services. For SDI services the training
of NASIC information services librarians will be of high
importance. On-line SDI, group SDI, and a service to provide
information on current research in specific disciplines are topics
covered on pp. 93-95 of this section.

In retrospective searching, quick-reference searches and
comprehensive searches can be offered. An important tool to
be generated for search services of this kind will be a printed
guide to available resources.

An interactive on-line capability at NASIC headquarters that
is open to members of the scientific community, a referral
service, and support for the creation and exploitation of
personal files are topics covered on pp. 98-100 of this section.

G. Software Aspects of Information Service Center Operation
(pp. 101-117)

Operations and procedures necessary to provide information
services desired of NASIC will be largely computer-based.
General criteria for selecting and evaluating computer software
are presented in the section in order to assist NASIC
in choosing among possible alternatives. The principal computer
program areas of interest to NASIC are: input/output
routines, search routines, data management routines, and
data-base transformation routines. In Table 13 evaluation and
selection criteria for information retrieval software are
summarized. These criteria include: documentation, reliability,
throughput time, availability of updating, cost,
maintenance, operating time, ease of usage, uniqueness, etc.

H. Software Considerations: Data Base Formats (pp. 118-125)

It is desirable for an information service center to con- 
vert all of its data bases into the same form and format in 
order to minimize searching software and processing. This 
topic together with discussion of various record formats and 
the use of file inversion to speed up processing, are covered 
in this section of the report. 
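
As an illustration of the two ideas in this summary, conversion to a common format and file inversion, the following Python sketch maps records from two invented supplier layouts into one record layout and then builds an inverted file. The field names, the supplier layouts and the helper names `to_common` and `invert` are assumptions made for the example; they do not describe any actual data base format.

```python
from collections import defaultdict


def to_common(record, field_map):
    """Rename supplier-specific fields to one common record layout."""
    return {common: record[native] for common, native in field_map.items()}


def invert(records):
    """Build an inverted file: index term -> set of record numbers."""
    inverted = defaultdict(set)
    for number, record in enumerate(records):
        for term in record["terms"]:
            inverted[term.lower()].add(number)
    return inverted


# Two hypothetical native formats for the same kind of record.
supplier_a = {"TI": "Arc welding of steel", "DE": ["Welding", "Steel"]}
supplier_b = {"title": "Laser welding", "keywords": ["welding", "lasers"]}

records = [
    to_common(supplier_a, {"title": "TI", "terms": "DE"}),
    to_common(supplier_b, {"title": "title", "terms": "keywords"}),
]
inverted = invert(records)
print(sorted(inverted["welding"]))  # [0, 1]
print(sorted(inverted["steel"]))    # [0]
```

Once every data base is in the common layout, a single search routine can serve all of them, and a term lookup in the inverted file replaces a pass through every record.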

I. Software Considerations: Search Operations (pp. 126-131)

The differences in search operations for current awareness
services and retrospective searching are reviewed in this
section. Particular emphasis is placed on the advantages and
disadvantages of on-line or off-line operations for each type of
search.

J. Software Considerations: Input/Output Processes, Data
Management Requirements (pp. 132-136)

This section discusses input/output routines and some
data management requirements of a typical information
services center.

Input routines such as syntactic analysis of English
input and profile inversion are discussed. The advantages
of "table-driven" output routines are mentioned and the
data management aspects of service center operation concerned
with such items as billing, accounting and privacy are
included on p. 136 of this section. Because of the large size
of data bases, the advantages and problems associated with
means for speeding up processing, such as grouping of requests,
are discussed.

K. Communications Aspects of Information Service Center
Operations (pp. 137-140)

The successful operation of an information services center
will be greatly dependent upon the nature of its supporting
communications system. Various types of communication of
interest to NASIC are noted in this section and there is
discussion of the use of direct dial-up versus dedicated
communication lines.

A serious problem will tend to develop as a NASIC operation
grows in terms of the volume of items handled - the
interconnection problem at its headquarters processing facility.
It is proposed that, for this potential interconnection problem,
the possibilities of cable technology using time-division
multiple-access digital communication in a high band-width
"bus" channel be kept in view.

L. Technology Change and Information Service Center Planning
(pp. 141-150)

Because NASIC is beginning operation during a period when
computer and communication technology is undergoing rapid change,
pp. 141-147 of this section are devoted to a brief overview of
relevant technology, with a summary of what this represents for
NASIC operations and what the implications are for NASIC
planning. Some of the topics discussed in this section are:
trends in computer technology, integrated network systems,
communications, large memories, low-cost processors, and low-cost
terminals.

It is emphasized that NASIC should set up certain experimental
subsystems where possible, to provide a reasonable interplay
with evolving technology. Consideration should be given
to the use of NASIC as a test-bed situation where new developments
could be tried and evaluated. It is doubtful that a
NASIC operation based on "state-of-the-art" methods and means
could provide coverage, speed of response and quality of service,
especially if a sizable user community is to be served.

M. Some Management Aspects of Information Service Center
Operation (pp. 151-153)

In this section of the report there is discussion of several
management aspects of interest to NASIC: objectives for
operational support, objectives for the user market, kinds of
service and promotion. An operation that will attract users
who are capable and willing to pay for services, a market
that includes users from the industrial and commercial community,
a service that attempts to understand what the information
need is and then focuses a means to meet that need, and a
promotional program that vigorously publicizes NASIC services
and resources are all desirable objectives which NASIC
management will wish to consider and attempt to achieve.



A. NASIC OBJECTIVES AND SOME FACTORS INFLUENCING THE ATTAINMENT
OF THESE OBJECTIVES

In broad terms, the function of NASIC is to improve the
accessibility of information needed by the science community
in the Northeast region of the United States, a major segment
of this community being composed of science professionals
associated with universities and colleges in the ten state area.*
This goal could be stated more precisely as a pair of closely
related objectives:

1. To improve the accessibility of information relevant 
to the Northeast science community. 

2. To improve the exposure of the Northeast science com- 
munity to relevant science information. 

*Maine, New Hampshire, Vermont, Massachusetts, Connecticut,
Rhode Island, New York, New Jersey, Pennsylvania, Delaware.

These objectives really form two sides of the same coin.
The former implies the ability to make specific, relevant
information and data readily available when the need for it
arises. That is, in response to a specific request made, NASIC
will initiate the searching of one or more appropriate data
bases in order to deliver to the requester the data, documents
or document references that appear to satisfy his immediate
need. This type of service is frequently referred to as a
"demand" or "on demand" service; it involves the searching of
accumulated files of data or references (retrospective searching).
The second broad goal, improving exposure, implies a service
more dynamic in nature. In this type of service the information
center does not wait for people to approach it with specific
demands. Rather, it attempts to inform users on a continuing
basis of newly published literature of direct relevance to
their current professional interests. This type of current
awareness or alerting service is perhaps best exemplified
by programs for the selective dissemination of information
(SDI), in which a computer is used to match the "interest
profiles" of users against the characteristics of documents
recently added to a particular data base.
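
The matching step of an SDI program can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: profiles and documents are represented as plain sets of index terms, a match is declared when at least `threshold` terms are shared, and the names `match_profiles`, `user_1`, `doc_a` and so on are hypothetical. Operational SDI systems of the kind discussed in this report use weighted or Boolean profiles that are not modelled here.

```python
def match_profiles(profiles, new_documents, threshold=1):
    """Return {user: [document ids]} for newly added documents that share
    at least `threshold` index terms with the user's interest profile."""
    notices = {}
    for user, interests in profiles.items():
        notices[user] = [
            doc_id
            for doc_id, terms in new_documents.items()
            if len(interests & terms) >= threshold
        ]
    return notices


# Hypothetical interest profiles and newly indexed documents.
profiles = {
    "user_1": {"welding", "argon"},
    "user_2": {"epilepsy"},
}
new_documents = {
    "doc_a": {"welding", "steel"},
    "doc_b": {"epilepsy", "reading"},
    "doc_c": {"caffeine"},
}
print(match_profiles(profiles, new_documents))
# {'user_1': ['doc_a'], 'user_2': ['doc_b']}
```

Raising `threshold` makes the notification tighter: with `threshold=2`, `user_1` would receive nothing from this batch, since no document carries both of that user's terms.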

In both of these broad activities, NASIC will play a
similar role. The role is essentially (see Figure 1) that
of an interface between the scientific community and the
universe of available bibliographic and data resources.
The major component of these resources, and the component
upon which we will concentrate in this report, will consist
of data bases available in machine-readable form. However,
in its overall service function, NASIC will certainly need
to draw upon other resources; i.e., manual files of one type
or another.

It is clear from Figure 1 that NASIC is involved in two
interfacing operations. On the one hand, it interfaces with
the members of the user population; on the other hand, it
interfaces with the producers and suppliers of data bases
(information wholesalers) and/or with various middlemen
(information retailers) who already offer services from these
machine-readable files.

[FIGURE 1: NASIC INTERFACE FUNCTION. NASIC stands between the
universe of bibliographic and other data resources and the
science community of the Northeast.]

In terms of effectiveness, the goal of NASIC is to provide
the maximum level of exposure/accessibility to relevant
data bases for the science community in the Northeast. In
terms of cost-effectiveness, the NASIC goal is to achieve
maximum exposure/accessibility per dollar expended. A higher
level objective is to provide to the science community
the maximum possible benefits from the exploitation of
information resources within the science community's available
operating budget (i.e., cost-benefit considerations).
Unfortunately, as pointed out elsewhere by Lancaster, the benefits
of information are notoriously difficult to measure. Moreover,
in actual practice, the distinction between cost-effectiveness
analysis and cost-benefit analysis, while real, is not always
absolutely clear and is therefore difficult to make.

This report will be concerned largely with various
considerations relating to the effectiveness, cost-effectiveness and
(ultimately) the benefits of NASIC operations.

Returning to the simple conceptualization of Figure 1, it 
is possible to identify three broad sets of factors that will 
exert a major influence on the performance of NASIC. These 
factors are: 

1. Which data bases NASIC makes available, or more part- 
icularly, NASIC priorities for the acquisition of data 
bases and for the level of accessibility provided to 
the data bases. 

2. How these data bases are "acquired" by NASIC. Here
we are concerned both with business considerations
(most convenient and inexpensive modes of acquisition
of data base or data base services) and with
technical performance considerations (e.g., a particular
file may be searchable by a number of different
searching systems and one may be more efficient
than the others).

3. How the data bases acquired by NASIC are made available
to the potential user population. Here our concern
is with the products and services of NASIC, and
how these services are presented to the user community.
Pertinent considerations include possible modes
of access to NASIC services, including interactions
between NASIC representatives and the user communities,
as well as telecommunications and networking
considerations.

Since NASIC is directly concerned with the provision of
information services, it is appropriate to give some consideration
here to the factors that importantly affect the performance
of an information service operation. These factors,
which must be borne in mind in later decision processes
relating to selection of data bases and modes of access provided
to these data bases, are discussed in the next section.

B. FACTORS AFFECTING THE PERFORMANCE OF INFORMATION RETRIEVAL
AND DISSEMINATION SYSTEMS

From the viewpoint of factors affecting system performance,
it is convenient to identify two types of information
retrieval operation:

1. The delegated search. This is the situation in which
the person needing information (i.e., the scientist
or practitioner in some other professional field) delegates
the responsibility for finding this information
to a second person, usually a librarian or other information
specialist. This mode of searching is the only
one possible in machine-based retrieval or dissemination
systems operated in an off-line, batch-processing
mode.

2. The non-delegated search. This is the situation in
which the person who needs information conducts his
own search directly. This mode is exemplified by a
scientist's use of a printed index (e.g., Index Medicus)
or by his interrogation of a remote data base
by means of an on-line terminal.

[FIGURE 2: factors influencing the success of a delegated search]

The major factors influencing the success of a delegated
search, with special reference to a search in a mechanized
retrieval system, are depicted in Figure 2. Whether or not a
requester approaches a particular information system or center
in the first place is dependent upon his expectations regarding
the scope and coverage of the service. Presumably he will not
approach the system unless he feels that the collection is
likely to contain the type of information or data he is seeking.
Having decided to consult the system, he must make his
needs known by means of a verbal request. The quality of this
request (i.e., the degree to which it actually matches his
information requirement) is dependent upon:

1. His interpretation of system capabilities and limita- 
tions. There is a strong tendency for a user to ask 
for what he thinks the system can give him rather than 
to ask for what he is really looking for. 

2. His mode of interaction with the system. 

3. His own ability to describe his needs, to express himself.

4. The degree of assistance and guidance given to the requester
by the system. Such assistance can take various
shapes: a carefully structured search request form, a
formal "request interview" process, an iterative search
procedure, or some type of user training program.

The request having been made to the system, it must be translated
into a formal search strategy by a member of the information
staff (search analyst). Now a new series of variables, affecting
the recall and precision of the search, come into play:

1. The analyst's own interpretation of what the user really
wants (which may be accurate or inaccurate).

2. The ability of the vocabulary to express the user's need.
For example, the user may specifically be seeking articles
on "argon arc welding" (and the search analyst recognizes
this) but the vocabulary may only be capable of
expressing this at a higher generic level - "shielded
arc welding" or "arc welding" - and thus precision
failures are inevitable.

3. The ability of the search analyst to recognize and
cover all possible approaches to retrieval. To take
a simple example, the requester may be looking for
articles on possible adverse effects of commonly consumed
beverages or components thereof. The searcher
uses the terms "caffeine", "coffee", "tea", and "theophylline",
but forgets about the possibility of "cacao"
and "theobromine" and thus misses some of the relevant
documents.

4. The "level" of search strategy adopted. The searcher
can choose to use a broad strategy (leading to high
recall but low precision) or a tight strategy designed
for high precision (but usually at the expense of
a low recall) or a compromise between the two extremes.

5. The capabilities of the searching software.
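
The effect of strategy "level" and of overlooked terms on recall and precision can be made concrete with a small Python sketch. It uses the standard definitions (recall is the fraction of relevant items retrieved; precision is the fraction of retrieved items that are relevant). The tiny index, the document numbers and the set of relevant documents are invented for illustration and loosely follow the beverage example above.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for a retrieved set against a relevant set."""
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


def or_search(index, terms):
    """Retrieve every document indexed under any of the given terms."""
    found = set()
    for term in terms:
        found |= index.get(term, set())
    return found


# Hypothetical index: term -> document numbers. Documents 1-7 are relevant.
index = {
    "caffeine": {1, 2},
    "coffee": {2, 3},
    "tea": {4},
    "theophylline": {5},
    "cacao": {6},
    "theobromine": {6, 7},
    "beverages": {1, 2, 3, 4, 6, 8, 9, 10},
}
relevant = {1, 2, 3, 4, 5, 6, 7}

tight = or_search(index, ["caffeine"])
incomplete = or_search(index, ["caffeine", "coffee", "tea", "theophylline"])
broad = or_search(index, list(index))  # every term, including the generic one

for name, result in [("tight", tight), ("incomplete", incomplete), ("broad", broad)]:
    recall, precision = recall_precision(result, relevant)
    print(name, round(recall, 2), round(precision, 2))
# tight 0.29 1.0
# incomplete 0.71 1.0
# broad 1.0 0.7
```

In this invented collection the strategy that omits "cacao" and "theobromine" loses the two documents reachable only through those terms, while the broad strategy recovers them at the cost of three irrelevant items.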

When the search strategy is actually matched against the
data base (i.e., the search is conducted), further factors
affecting performance come into play. One important performance
factor is that of indexing policy, particularly policy regarding
exhaustivity of indexing (which really equates with the number of
index terms or other access points provided). Perhaps the
exhaustivity of indexing is inadequate to allow some of the
relevant items for a particular request to be retrieved.
Inaccuracy of indexing (omission of important terms or incorrect
assignment of terms) will also lead to recall or precision
failures.* The characteristics of the vocabulary affect the
indexing process as much as they affect the searching process.
An indexer can only adequately represent the concepts occurring
in a document if there are appropriate specific terms available
for him to use. Lack of specificity in the vocabulary will
usually cause precision failures but can also lead to recall
failures. Further, the vocabulary must be capable, to a certain
extent, of showing the syntax of the terms assigned in
indexing, thereby avoiding at least some of the precision
failures that would be caused by false coordinations or
incorrect term relationships.

Finally, before the results of a search are submitted to
the requester, the analyst may screen the output and eliminate
items that appear to be irrelevant, with the object of improving
the precision of the search to the end user. How successful
this screening operation is (i.e., how much precision can be
improved without having too serious an effect on recall) depends
primarily upon the accuracy of the analyst's interpretation of
the requester's requirements. Secondarily, the success of the
screening will be affected by the quality of the document
surrogate from which the analyst is working.

*A recall failure is the failure of the system to retrieve a
relevant document or other item. A precision failure is the reverse
of this, the failure of the system to avoid an irrelevant item.

Of course, these various sources of failure are cumulative.
For a particular search conducted in a retrieval system, some of
the relevant documents may be missed by the very fact that the
user's request statement is too restrictive and inadvertently
excludes certain items. Others may be missed due to poor search
strategy, vocabulary inadequacies, indexing policy, and indexer
omissions. Finally, the analyst may eliminate some more relevant
items in his screening process. With so many possible
sources of loss, it is little wonder that systems do not on
the average operate very close to 100% recall. A similar
cumulative effect occurs to prevent us obtaining 100% precision.
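
The cumulative character of these losses can be illustrated with a rough Python sketch. It rests on a simplifying assumption that is not made in this report: each stage independently retains a fixed fraction of the relevant items that reach it, so that overall recall is the product of the stage fractions. The stage names follow the paragraph above; the fractions are invented.

```python
from math import prod


def overall_recall(stage_retention):
    """Overall recall if each stage keeps a fixed fraction of the relevant
    items that survive the previous stage (independence is assumed)."""
    return prod(stage_retention.values())


# Hypothetical fraction of relevant items retained at each stage.
stages = {
    "request statement": 0.90,
    "search strategy": 0.85,
    "vocabulary": 0.95,
    "indexing": 0.85,
    "output screening": 0.95,
}
print(round(overall_recall(stages), 2))  # 0.59
```

Under these invented figures no single stage loses more than 15 percent of the relevant items, yet the search as a whole delivers only about 59 percent of them.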

The performance factors illustrated in Figure 2 are relevant
to all types of delegated searching systems, manual as well
as mechanized, dissemination systems as well as retrospective
searching systems. It is important to keep these factors in
mind when making decisions relating to the selection of data
bases and the provision of services from these data bases.

It is clear, for example, how the characteristics of the
data base itself exert a considerable influence on the performance
level of services provided from this data base. In the
first place, in its subject coverage and level of treatment,
the data base searched must be appropriate to the information
need for which the search is being conducted. Selection of
the most appropriate data base to search is clearly a first
step that is critical to the entire information seeking process.
In interrogating a particular data base, the searcher is limited
in what he is able to do by certain inherent characteristics of
the data base itself. The most important of these characteristics
are the exhaustivity of the indexing and the specificity of
the vocabulary used in indexing. An exhaustive representation
of a document is one that presents a fairly complete searchable
representation of its subject matter. The most exhaustive
"representation" of a document is the complete text of the
document itself, available in machine-readable form for searching
on a word-by-word basis (as, for example, in certain
legal retrieval systems). A fairly full representation would
consist of a searchable abstract or a set of, say, 10 - 20
index terms which collectively represent at least the major
subject matter of the document. A record of low exhaustivity
would consist only of the title of a document or a small number
of index terms representing only part of the subject matter
covered. By "exhaustivity of indexing", then, we mean
essentially the number of access points provided to a bibliographic
record. Exhaustivity is the prime factor governing
the recall capability of a retrieval or dissemination system
(i.e., the ability of the system to retrieve relevant references)
since it is clear that any particular record can only
be retrieved by the access points provided in the data base.
That is, a data base in which the records are exhaustive
representations of the subject matter of documents is capable of
providing a high recall, whereas a data base in which the records
are nonexhaustive representations of subject matter cannot provide
a high recall except at the expense of an unacceptable
level of precision (to retrieve a high percentage of all relevant
items on a particular topic we would need to retrieve a
very large number of items most of which would not be relevant
to the specific subject of the search).

The precision of a retrieval system is largely governed
by the specificity of the vocabulary used to index subject matter.
A highly specific vocabulary will allow a high level of
precision in searching whereas a nonspecific vocabulary dooms
a system to low precision. That is, a searcher cannot interrogate
a data base any more specifically than the vocabulary
of the data base allows him to. For example, a searcher in
MEDLARS may be looking for information on acute frontal sinusitis.
The vocabulary of the system does not permit indexing
of this subject matter at this level of specificity. The most
specific relevant index term is "sinusitis", which means that
the entire class of documents on "sinusitis" must be retrieved
in a search on "acute frontal sinusitis". Since most "sinusitis"
documents will not be relevant to the specific topic of
"acute frontal sinusitis", it is obvious that a search on this
topic will have a low precision. In general, natural language
data bases (e.g., those providing a complete searchable abstract)
provide high specificity and therefore are capable of high
precision in searching. Controlled vocabularies, such as thesauri
or lists of subject headings, tend to be less specific than natural
language, and thus tend to operate at a somewhat lower level
of precision. On the other hand, natural language data bases
may create greater problems of semantic and syntactic ambiguity.
These problems take two forms: false coordinations (the situation
in which two terms on which a search is conducted, while present
in a retrieved document, are essentially unrelated in that
document) and incorrect term relationships (the situation in which
two search terms, although related in a document, are related
in a way other than the relationship sought by the requester
- e.g., the difference between "reading" stimulating "epilepsy"
and the "reading disabilities of epileptic children").
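
The two forms of ambiguity can be seen in a toy Python example. The sketch assumes a plain word-level AND search over natural language text; the three document texts and the function name `and_search` are invented for illustration, with the request taken to be for material on reading as a stimulus of epilepsy.

```python
def and_search(documents, *terms):
    """Return ids of documents whose text contains every search term as a word.

    No account is taken of how the terms relate to one another in the text.
    """
    wanted = set(terms)
    return [
        doc_id
        for doc_id, text in documents.items()
        if wanted.issubset(text.lower().split())
    ]


# Hypothetical natural language records.
documents = {
    "doc_1": "epilepsy induced by reading",                      # wanted
    "doc_2": "reading disabilities of children with epilepsy",   # wrong relationship
    "doc_3": "survey of reading instruction and a note on epilepsy drugs",  # unrelated terms
}
print(and_search(documents, "reading", "epilepsy"))
# ['doc_1', 'doc_2', 'doc_3']
```

All three records satisfy the search logic, but only the first matches the request: the second is an incorrect term relationship and the third a false coordination, so precision for this search is one in three.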

Two further data base characteristics exert influence on
the performance of a retrieval or dissemination operation.
The first of these is the quality of the indexing. Whereas
exhaustivity of indexing is controlled by policy decisions
made by system managers, the quality of the indexing (or lack
of it) is controlled by the individual members of an indexing
staff (assuming that some human indexing operation does
actually take place). Indexing errors are of two types:

a. The omission of an important index term, or

b. The use of an incorrect term, i.e., a term inappropriate
to the subject matter covered by a document.

If a data base is well prepared, with some built-in quality
control procedures, indexing errors are likely to be relatively
infrequent. But some machine-readable data bases are far
from error-free and their effectiveness is diluted as a result.

The second characteristic relates to the "indicativity" of
the record stored in the data base. It is important that the
document record provided by a retrieval or dissemination system
(in a printout or in an on-line display) gives a representation
of its subject matter sufficient to allow a reader to determine
its probable relevance to his own information need or that of
someone else for whom the search is being conducted. In general,
titles are inadequate indicators of subject content; abstracts
or the complete set of index terms assigned to a document
(i.e., the "tracings") are somewhat better. Studies
conducted by Project Intrex (see Marcus et al. (2)) discovered
that relevance decisions made on the basis of document
titles agreed at the 60-70% level with relevance decisions
made on the basis of the complete text of these documents,
whereas relevance decisions made on the basis of abstracts
or lists of index terms agreed with those made on full text
at the 70-90% level. Similar findings have been made by
Saracevic (3). The indicativity of a record is roughly
proportional to the length of the record in English words. That
is, the longer the record the more useful to the user it is
likely to be as a predictor of relevance, but, of course, the
more costly the record to store and to print out.

Turning once more to Figure 2, it can be seen that there
are two further broad categories of factors influencing the
performance of an information retrieval system. These factors,
not directly related to the data base itself, are related
to the searching strategies used and the method of interaction
between the user of the system and the system itself.

In any delegated search system a major factor influencing
performance is the mode of interaction between the
user and the system in the negotiation of the user's actual
information need. The result of inadequate interaction is
that the search analyst (or other information specialist)
is left with an imperfect representation of what the user
really wants. Under these conditions a fully successful
search is unlikely, and it is quite possible that the results
will be completely unacceptable to the requester. The
major factors influencing the success (or failure) of the
user-system interaction process are illustrated in Figure
2 and were touched upon earlier. The problems of user-system
interaction in the negotiation of search requests
have been discussed in detail by Lancaster (4) (5). In his
evaluation of MEDLARS, Lancaster discovered that imperfect
user-system interaction accounted for about 25% of all the
recall failures and 16% of all the precision failures occurring
in the system. Thus, this was a major source of system
failure, as it will always tend to be in a machine-based
delegated search system. Clearly, this has important implications
for the procedures whereby the NASIC information services
librarians interact with the user community, a point we
will return to in greater detail later.

The final set of factors influencing the performance of
a retrieval or dissemination system are the searching strategies
themselves. As indicated in Figure 2, the quality of
a search strategy is dependent upon a large number of factors,
including the searcher's interpretation of user needs and his
own ability to construct a strategy that is complete (i.e.,
covers all possible approaches to the subject matter) and
logically correct. However, the searcher is somewhat constrained
by various data base characteristics and by the properties
of the searching software available to him. Two important
data base constraints have been mentioned already:
the exhaustivity of the indexing (which essentially governs
recall) and the specificity of the vocabulary used in indexing
(which, in an absolute sense, controls the level of precision
possible). It is not possible for a searcher to overcome
these data base constraints; that is, the degree to which
he can vary a search strategy to produce a high level of recall
or a high level of precision is largely dependent upon the
exhaustivity of the indexing and the specificity of the vocabulary.
One major task faced by a searcher in a machine-based
system operated off-line is that of thinking of all possible
approaches to the retrieval of material on the subject of the
search. How successful he is in this is at least partly dependent
upon data base characteristics again; that is, he is
more likely to be able to devise a comprehensive strategy if
the data base employs a highly structured vocabulary that
explicitly reveals hierarchical and other semantic relationships
among terms. The thesaurus of such a system provides a very
important aid to the searcher.
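
The first of these constraints, that a record can be retrieved only through the access points it carries, can be shown with a short Python sketch. The three documents are hypothetical and all are assumed to discuss corrosion; they are represented once by title words only and once by a fuller set of index terms.

```python
def retrievable(records, query_term):
    """Return the documents reachable through `query_term`; a record can be
    retrieved only by an access point that it actually carries."""
    return {
        doc for doc, access_points in records.items() if query_term in access_points
    }


# The same three hypothetical documents at two levels of exhaustivity.
title_only = {
    "doc_1": {"corrosion", "pipelines"},
    "doc_2": {"welding", "steel"},
    "doc_3": {"offshore", "platforms"},
}
exhaustive = {
    "doc_1": {"corrosion", "pipelines", "coatings"},
    "doc_2": {"welding", "steel", "corrosion", "fatigue"},
    "doc_3": {"offshore", "platforms", "corrosion", "seawater"},
}
print(sorted(retrievable(title_only, "corrosion")))  # ['doc_1']
print(sorted(retrievable(exhaustive, "corrosion")))  # ['doc_1', 'doc_2', 'doc_3']
```

No search strategy can recover `doc_2` and `doc_3` from the title-only file by way of "corrosion"; the searcher would have to guess at unrelated terms and accept a loss of precision.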

Another constraint is placed upon the searcher by the
searching software available. Obviously, the searcher is
limited by the capabilities provided in this software (e.g.,
truncation capabilities, word proximity operators, nesting
levels, and so on). The more flexible the software, the more
features it has, the more powerful is the tool available to the
intelligent searcher. Clearly these various factors must be
borne in mind by NASIC in the selection of data bases, the
selection of service centers and/or the selection of available
software for in-house use.

While Figure 2 depicts the major factors influencing
performance of a delegated search system (as exemplified by
one operating in an off-line, batch processing mode), Figure 3
depicts the major factors affecting the performance of a
nondelegated search system. Figure 3 would, for example, be
typical for an on-line retrieval system interrogated directly
by a scientist to satisfy his own information needs. This is
in some ways a simpler situation. The "interaction" failures
that tend to be prevalent in delegated search systems are
avoided, but the vocabulary and indexing factors are just as
important in the on-line situation as they are in the
off-line. Apart from these data base constraints the major
factors influencing the success or failure of a nondelegated
on-line search are the searcher's knowledge of data base
characteristics (especially his knowledge of the vocabulary of
the system), his ability to verbalize his requirement and to
think of all reasonable approaches to retrieval (assuming that
he wishes to achieve a high recall), the assistance and
guidance given by the system, his ability to construct
logically correct searching strategies, and the capabilities
of the searching software he is working with.
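
Two of the software capabilities named above, truncation and word
proximity operators, can be illustrated with a minimal sketch. The
function names, the sample title, and the matching rules are
assumptions made for illustration only; they do not describe the
software of any particular service center.

```python
import re


def tokenize(text):
    """Lower-case a text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_truncation(tokens, stem):
    """Right-hand truncation: true if any token begins with `stem`."""
    return any(token.startswith(stem) for token in tokens)


def within_proximity(tokens, first, second, max_gap):
    """Word proximity: true if `first` and `second` occur no more than
    `max_gap` word positions apart, in either order."""
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    return any(abs(i - j) <= max_gap
               for i in first_positions
               for j in second_positions)


# Hypothetical document title used only to exercise the functions.
title = "Computer methods in mass spectrometry of organic compounds"
tokens = tokenize(title)

truncated_hit = matches_truncation(tokens, "spectro")   # matches "spectrometry"
adjacent_hit = within_proximity(tokens, "mass", "spectrometry", 1)
distant_hit = within_proximity(tokens, "computer", "spectrometry", 2)
```

Here the truncated stem and the adjacent word pair both match, while
the pair "computer" and "spectrometry" fails a two-word proximity
limit because the words lie four positions apart.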

If the on-line system is being used in the delegated search
mode the situation is much the same as the situation in the
off-line, batch processing search as presented in Figure 2.
There is one important difference, however: the off-line
search is noninteractive and nonheuristic. The searcher must
think in advance of all likely approaches to retrieval.
Mistakes are relatively costly since the searcher will
probably not know for some days (i.e., until he receives his
search printout) whether or not his search has been
successful. If it has been unsuccessful, he must start all
over again. In the on-line search, on the other hand, he can
see immediately whether or not a particular strategy is
"hitting" the type of item he is seeking. If not, he can
modify his strategy immediately. Searching on-line is
interactive and heuristic and mistakes are less costly because
they are easily noticed and corrected. A certain amount of
"browsing" is also possible in an on-line system.

[Figure 3. Factors Affecting the Non-Delegated Search]

In all of this discussion we have concentrated on
bibliographic systems at the expense of data retrieval systems. This
is deliberate. Bibliographic systems are much more complicated 
in that they are dealing with "soft" rather than "hard" data. 
In searching for documents (or references to them) the search- 
er is faced with problems of semantic ambiguity and the inher- 
ent imprecision of language. There is no answer from a bib- 
liographic system that is inherently and unequivocally "cor- 
rect". On the other hand, a system dealing with numerical 
data would generally be free from semantic ambiguity. There 
is usually one correct answer to a query posed to such a sys- 
tem (i.e., some numerical value). 

Nevertheless, some of the problems identified in rela- 
tion to the use of bibliographic systems may be equally rel- 
evant to the use of data retrieval systems. The user of a 
data retrieval system will also be constrained by data base 
characteristics, including the amount of data provided and
how this data is formatted. The user will also be concerned
with how comprehensive the data base is, how accurate the
data is, and how current it is. Another important
consideration will relate again to searching software; that
is, just how can the data be manipulated by the software
available?

We have gone to some lengths to discuss these factors 
that importantly affect the performance of all information 
retrieval and dissemination systems because it is important 
that all of these things be kept in mind in making decisions 
relating to NASIC operations. Clearly, we should try to
identify the major factors likely to critically affect the
performance of NASIC information services. Knowing what these
factors are will help in making decisions relating to data
base evaluation and selection, data base acquisition, modes
of interaction with NASIC users, and products and services to
be provided. Some guidelines relating to these various
decision processes will be discussed in the ensuing sections
of this report.



C. CRITERIA RELATING TO THE SELECTION OF DATA BASES 



A major emphasis in the NASIC program will be the provision
of services based upon machine-readable data bases. Less than
a decade ago, almost nothing of significance in the way of
bibliographic or numerical data bases was available in
machine-readable form. The situation has changed
dramatically. The first general survey of available files,
that of Carroll (6), listed 55 machine-readable files, and an
Auerbach report (7) listed 34 suppliers of machinable
bibliographic records, both of the lists appearing in 1970.
An ASLIB survey of 1972 (8) identified 48 different services,
with special reference to services available in the United
Kingdom. The ASIS survey (9), published in 1973, provides
information on 81 commercially available machine-readable
files, while a list issued by the LARC Association (10), also
in 1973, mentions 122 "available data banks for library and
information services". Probably the most complete survey,
however, was issued by Computer Sciences Corporation (11) in
1972. This survey mentions 169 information systems having
machine-readable data bases, although 268 such systems had
actually been identified by the compiler. Conservatively,
then, one could state that there exist on a worldwide basis
at least 300 machine-readable bibliographic files, although
not all of these are readily available for outside use. If we
add data files to this, the total number of data bases will
be very much greater.

With so many data bases to choose from, how can NASIC, or
any other information center, decide which to make available
and in what order of priority? Disregarding any contractual
constraints, it is useful to give some general consideration
to the subject of evaluation of available data bases. Some of
the most important selection criteria are summarized in
Table 1.

Without any doubt, the major consideration is the subject
matter of each data base. The major concern of NASIC, or any
other information center, should be to make services
available in order of priority according to the anticipated
demand for these services within the user community to be
served. This implies the need to match data base content with
the "subject profile" of the NASIC user community. In terms
of disciplines and major subdisciplines such a profile can be
constructed from faculty directories, college catalogs,
membership lists, and similar publications. This type of
broad analysis will yield the approximate numbers of
potential users of information services in general subject
areas: chemistry, physics, medicine, electrical engineering,
education, and so on. To assess the potential market for
interdisciplinary services (e.g., the NASA data base or a
data base in the area of environmental protection), for
services of a more specialized nature (e.g., in spectroscopy,
in toxicology, in crystallography), or for services covering
particular types of material or data (e.g., patents, product
catalogs) is rather more difficult and will involve the use
of more sophisticated market analysis techniques.



TABLE 1

CRITERIA AFFECTING THE SELECTION OF
MACHINE-READABLE DATA BASES

1. Subject Matter Of Data Base

     Match of subject matter with subject interests of user community.

2. Cost Factors

     Cost of "acquiring" data base or services from it.
     Unit cost per user interest profile.
     Unit cost per retrospective search.
     Unit cost for group profiles.
     Cost of acquiring data base in relation to number of records
       provided.
     Cost of acquiring data base in relation to number of access
       points provided per record.
     Cost of data base in relation to "quality" considerations.

3. Quality Considerations

     Coverage

          Coverage by number of sources.
          Coverage by type of source.
          Coverage by number of "items".
          Coverage by time span.
          Completeness in relation to specific topics of interest
            to user community.
          "Uniqueness" of data base. Overlap with other data bases.

     Time Factors

          Time lag in inclusion of sources.
          Frequency of file update.

     Indexing and Vocabulary Factors

          Specificity of vocabulary.
          Is vocabulary controlled?
          Are searching aids provided?
          Prevalence of semantic and syntactic ambiguity.
          Exhaustivity of indexing. Number and variety of access
            points provided.
          Accuracy and consistency of indexing. Observed error rates.

4. Implementation Factors

     Assurance of continuity.
     "Cleanliness" of data base.
     Compatibility with in-house software and hardware.
     Amount of pre-processing needed.
     "Integratibility" with other data bases handled.
     Capabilities of searching software available.



The cost of making a particular data base available 
must be considered in relation to anticipated demand for 
that data base. Some of these cost considerations will be 
considered in greater detail in the next section of this re- 
port. In general, the unit cost of any information service 
is extremely volume-dependent. This would be particularly 
true in the case of a data base that must be purchased or 
licensed directly and manipulated in-house on NASIC facili- 
ties. It will also be true, although to a lesser extent, in 
the purchase of services through an existing information re- 
tailer. That is, NASIC can and should expect to be able to 
negotiate reduced rates for a high volume of use of any part- 
icular service. Since extent of use of a particular service 
will be at least partly dependent upon the cost to the user, 
NASIC must endeavor to estimate a realistic level of poten- 
tial demand for each service and calculate the unit cost of 
this service (e.g., per retrospective search, per SDI profile)
based upon this expected level of demand. Top implementation
priority should be given to those data bases that, because
of anticipated demand and estimated unit cost per service
unit, are likely to attract the greatest volume of business
to the information center.
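
The volume dependence of unit cost described above can be sketched
with a simple model in which a fixed annual cost is spread over the
number of searches and a per-search variable cost is added. The model
itself and all dollar figures below are hypothetical assumptions
chosen for illustration; they are not NASIC estimates.

```python
def unit_cost(fixed_annual_cost, variable_cost_per_search, searches_per_year):
    """Unit cost per search under a fixed-plus-variable cost model.

    fixed_annual_cost: lease, storage and staff costs that do not
        depend on volume (hypothetical).
    variable_cost_per_search: computer time and output cost incurred
        by each search (hypothetical).
    """
    if searches_per_year <= 0:
        raise ValueError("searches_per_year must be positive")
    return fixed_annual_cost / searches_per_year + variable_cost_per_search


# Hypothetical figures: $20,000 fixed cost per year, $5 per search.
for volume in (500, 2000, 8000):
    print(volume, "searches per year:",
          round(unit_cost(20_000, 5.0, volume), 2), "dollars per search")
```

Under these assumed figures the unit cost falls from $45 per search
at 500 searches a year to $7.50 at 8,000 searches a year, which is
why a realistic demand estimate must precede any unit cost estimate.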

Independent of expected demand, however, there are other
cost effectiveness considerations that might be taken into
account in the evaluation of machine-readable data bases.
If a data base must be purchased, leased or licensed, the
purchaser should consider what exactly he is getting for the
money invested. One possible measure of return on investment
is the cost per searchable record. For example, the Science
Citation Index tapes can be leased (source tape plus citation
tape) for $20,000 per year. This buys access to approximately
400,000 searchable records, an average cost of about 5 cents
per year per searchable bibliographic record. In contrast,
it costs $30,000 per annum to lease one year's worth of MEDLARS
tapes, containing about 250,000 records, which gives a cost of
rather more than 8 cents per record.*

*It seems that most data bases lease for something in the range
of 5 to 10 cents per item per year and that the leasing rate
should generally not exceed this. Leasing rates for the GeoRef
files of the American Geological Institute work out to about
10 cents per record per year. For the METADEX files of the
American Society for Metals the cost is 5 cents per record per
year (i.e., for $1250 about 25,000 items are available through
the leasing arrangement).

The size of the data base is not, of course, the only
consideration. Another important cost-effectiveness factor
would be the number of access points provided per record since
this, ultimately, controls the retrieval capabilities of the
system. A data base providing a large number of access points
per record (e.g., a citation plus complete searchable abstract
or a citation plus an average of 10 - 20 humanly assigned
descriptors) will generally be more costly to create and thus,
presumably, more costly to acquire by purchase, leasing or
licensing arrangements. Such a file would also be more
difficult to compress and thus more costly to store and
manipulate. Nevertheless, in such a file, each record may be
accessible from a large number of different approaches. Since,
essentially, an information center acquires a data base in
order to obtain convenient access to the individual items in
this file, the cost of offering service from a particular data
base needs to be balanced against the amount of access the
file provides. From a cost-effectiveness viewpoint, the price
of a data base must be related not only to the number of
records the investment is purchasing but also the amount of
access to the records that is being purchased. As an example,
the MEDLARS data base, or a large portion of it, is indexed
quite exhaustively. The file is expensive to lease but this
cost is not necessarily excessive in relation to the amount of
access provided.
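
The two measures discussed above, cost per searchable record and cost
per access point, amount to simple divisions. The sketch below uses
the Science Citation Index and METADEX lease figures quoted in the
text; the average number of access points per record is a hypothetical
value introduced only to show the second calculation.

```python
def cost_per_record(annual_lease, records):
    """Annual lease cost divided by the number of searchable records."""
    return annual_lease / records


def cost_per_access_point(annual_lease, records, access_points_per_record):
    """Annual lease cost divided by the total number of access points."""
    return annual_lease / (records * access_points_per_record)


# Lease figures as quoted in the text.
sci_cost = cost_per_record(20_000, 400_000)      # Science Citation Index
metadex_cost = cost_per_record(1_250, 25_000)    # METADEX

# Hypothetical assumption: an average of 10 access points per record.
sci_access_cost = cost_per_access_point(20_000, 400_000, 10)

print(f"SCI: {sci_cost * 100:.1f} cents per record")
print(f"METADEX: {metadex_cost * 100:.1f} cents per record")
print(f"SCI at 10 access points: {sci_access_cost * 100:.2f} cents per access point")
```

A file that looks expensive per record may look quite different per
access point, which is the comparison the text recommends for an
exhaustively indexed data base.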

So far we have talked only of quantitative considerations
- how many records are purchased, how much access is purchased.
The quality of the data base must also be considered, although
quality is not always easy to assess. Major qualitative
considerations would include the following: (1) Coverage,
(2) Time Factors, (3) Indexing And Vocabulary Considerations,
(4) Continuity, and (5) Ease Of Implementation.
Coverage 

Coverage is a critical consideration and there are a number
of dimensions to this problem. One we have already mentioned
is the number of individual items included in the file. A
second is the scope of the coverage. Scope may be expressed in
terms of the number of different sources regularly covered in
the data base (e.g., number of journals indexed) and in terms
of the range of types of sources included (journal articles,
technical reports, patents and so on). The National Library
of Medicine, for example, regularly indexes close to 250,000
items a year from about 2500 different sources, but this
coverage is almost entirely from biomedical and general
scientific journals. The coverage of technical reports is very
limited, as is the coverage of conference proceedings or
symposia (except those appearing in a regular series), and
medically related patents are not included at all.

Another dimension of data base coverage relates to time
span. To be of any use at all, for most purposes of
retrospective search, a data base must span at least 2-3 years
of the literature, and it is probably only beginning to
approach a very high level of value when it covers about five
years of literature. Hansen (21), for example, reports little
interest from users in a search of CA Condensates tapes going
back 2-3 years only. In the humanities and social sciences
time span is somewhat more important than it is in scientific
and technological fields, where literature and some data tend
to be superseded rather rapidly.

It should always be borne in mind that the great
discipline-oriented services are not necessarily complete, or
even close to complete, in specific topical areas falling
legitimately within their own scientific disciplines. Davison
and Matthews (12), for example, found that twelve major
indexes relating to chemistry and spectroscopy each covered
only a very small proportion of a collection of 183 references
known to exist on the topic of "computers related to mass
spectrometry". In fact, no single source included more than
40% of the known references. Chemical Abstracts was found to
include only 24% of these references, Referativnyi Zhurnal
Khimii only 5.5% and Chemisches Zentralblatt only 4.5%. For
this reason it is important that NASIC services include the
widest possible use of specialized information and data
centers, whether they have machine-readable files or not, as
well as the great discipline- or mission-oriented data bases.
In the area of mass spectrometry, for example, it would be
important to use the services of the Mass Spectrometry Data
Center (of the United Kingdom Atomic Energy Authority) as
well as Davison's own service at the Scientific Documentation
Centre, Dunfermline, Scotland.

Bourne's (13) evaluation of the Bibliography of Agriculture
concluded that this source included only 48 - 58 percent of
the available literature relevant to the interests of
agriculture researchers and that the material missed was not
at all of an evanescent nature but was predominantly English
language material, mostly from journals and conference
proceedings, and much from U.S. sources. An analysis of the
coverage of material from USDA, a research laboratory, and
several state agriculture experiment stations or extension
services indicated that the Bibliography of Agriculture
appeared to cover only 45 to 74 percent of this type of
material. An extensive program of studies by Martyn (14) also
indicated that it is unrealistic to expect that any one
service relatively broad in scope is likely to be
comprehensive in its coverage of some specific subtopic
falling within the broad subject area.

Indeed, it would be very unusual if a broad
discipline-oriented service did provide comprehensive coverage
of some specific subtopic. The "law of scattering", first
propounded by Bradford (15) in 1935 and since substantiated by
numerous other investigators (see Fairthorne (16)), indicates
clearly that, while a relatively small number of "core"
services are likely to account for a relatively large
proportion of all the references on a particular topic, the
remaining references will be very widely dispersed over a
great many sources, including sources that are very peripheral
to the subject area covered by the core journals. In other
words, the distribution of references over sources is
hyperbolic in nature. This "empirical hyperbolic distribution"
is identical with the distribution of the use of words in
published text, as observed by Zipf (17).
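
The practical consequence of a hyperbolic distribution can be
sketched numerically. The model below is an assumption made for
illustration, not the formulation used by Bradford or Zipf: sources
are ranked by productivity and the source of rank r is taken to
contribute references in proportion to 1/r. The number of sources
(1,000) is likewise hypothetical.

```python
def coverage_share(top_k, total_sources):
    """Share of all references found in the `top_k` most productive
    sources, assuming the source of rank r yields references in
    proportion to 1/r (an illustrative Zipf-like model)."""
    if not 0 <= top_k <= total_sources:
        raise ValueError("top_k must lie between 0 and total_sources")
    top = sum(1.0 / rank for rank in range(1, top_k + 1))
    total = sum(1.0 / rank for rank in range(1, total_sources + 1))
    return top / total


# Hypothetical literature scattered over 1,000 sources.
for k in (10, 100, 500, 1000):
    print(f"top {k:4d} sources: {coverage_share(k, 1000):.0%} of references")
```

Under this assumed model a small core of sources accounts for a large
share of the references, but each further increment of coverage
requires a rapidly growing number of additional sources, which is the
pattern the law of scattering describes.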

It is likely that a number of different data bases may
need to be used to achieve a really comprehensive coverage
of literature on some particular topic. For example,
Montgomery (24) found that four data bases collectively
covered 98% of a sample of toxicology articles published in
1968, but the maximum coverage of the first service was 85%.
Another factor relating to coverage is the "uniqueness" of
the data base. There is a considerable amount of overlap among
several major data bases. For example, there is great
duplication between MEDLARS and EXCERPTA MEDICA and it has
been discovered that at least 50,000 articles a year are
abstracted by at least two of the services BIOSIS, CAS and
Engineering Index (18), although this duplication is likely
to be reduced in the future by means of cooperative
arrangements between the publishers of these services. Certain
other services (19), on the other hand, are relatively
"unique"; that is, they overlap others very little. Bourne
(20), for example, discovered that no other service overlapped
the Bibliography of Agriculture by more than 20%. Duplication
is not in itself necessarily bad. Because of the
interdisciplinary nature of many subject areas one would
expect, in striving for completeness, that one major service
might overlap another. Nevertheless, in establishing
priorities for the implementation of services, NASIC should
give careful consideration to matters of overlap and
uniqueness and should endeavor to implement as early as
possible a set of data bases that collectively give the
broadest possible coverage of the entire scientific and
related literatures.
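
Overlap and uniqueness can be measured directly when the items
covered by each data base are known for a sample of literature. The
sketch below treats each data base as a set of item identifiers; the
data base names and their contents are hypothetical and are not taken
from any of the studies cited.

```python
def overlap_fraction(a, b):
    """Fraction of the items in `a` that also appear in `b`."""
    return len(a & b) / len(a) if a else 0.0


def unique_items(name, coverage):
    """Items covered by data base `name` and by no other data base."""
    others = set().union(
        *(items for other, items in coverage.items() if other != name)
    )
    return coverage[name] - others


# Hypothetical sample: item identifiers covered by three data bases.
coverage = {
    "A": {1, 2, 3, 4, 5, 6},
    "B": {4, 5, 6, 7, 8},
    "C": {9, 10},
}

a_in_b = overlap_fraction(coverage["A"], coverage["B"])
only_in_a = unique_items("A", coverage)
only_in_c = unique_items("C", coverage)
collective = set().union(*coverage.values())

print("share of A duplicated in B:", a_in_b)
print("unique to A:", sorted(only_in_a))
print("unique to C:", sorted(only_in_c))
print("items covered collectively:", len(collective))
```

In this made-up sample, data base C is small but entirely unique, so
it adds more to collective coverage than its size alone would suggest.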

Before we leave this matter of coverage two other factors
should be mentioned. First, it is important that a service
covers regularly and consistently all the sources it claims
to cover. In other words, the reliability of the coverage of
sources is important. We must be able to rely on a service to
cover all issues of a particular journal, all reports of a
particular series, and so on. If we cannot rely on a service
in this way (i.e., we find that it sometimes indexes a
particular journal and sometimes does not) its value is
seriously degraded.

Another factor relates to the reliability of the service
in selecting all articles, from the sources it claims to
cover, that legitimately fall within the subject area of its
coverage. A comprehensive service such as the Science
Citation Index will include everything from the sources it
covers, but other services, while they may cover some sources
comprehensively, will cover others selectively. Again, if we
cannot rely on the service to pick up everything of relevance
to its stated scope, its value will be degraded. Devon et al
(23), for example, in comparing Ringdoc with CBAC, found that
the latter covered 580 journals to the former's 332 but
produced evidence that suggests that Ringdoc's selection of
articles is better, at least in the pharmaceutical area.

Stern (25), after disclosing that no single source is
likely to give a comprehensive coverage of all literature
relevant to the pharmaceutical industry, suggests that it
should be possible to plan an information retrieval strategy
in terms of the cost needed to obtain a particular level of
coverage. It is possible to select one or more services
which, in a particular subject area, will yield a given
level of coverage (70%, 80%, 90%). An alternative strategy
would be to select those services that are likely to give
the best coverage for a specified amount of money available.
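
The first strategy described above, choosing services to reach a
given level of coverage, can be sketched as a greedy selection that
repeatedly adds the service yielding the most newly covered items per
dollar. This is a simple heuristic offered as an assumption; it is
not Stern's method and does not guarantee the cheapest possible
combination. Service names, contents and costs are hypothetical.

```python
def select_services(services, known_items, target):
    """Greedily choose services until `target` (a fraction of
    `known_items`) is covered.

    services: mapping of name -> (set of covered items, annual cost).
    Returns (chosen names in order, coverage achieved, total cost).
    """
    chosen, covered, spent = [], set(), 0.0
    remaining = dict(services)

    def gain(name):
        # Newly covered relevant items per unit of cost.
        items, cost = remaining[name]
        return len((items & known_items) - covered) / cost

    while remaining and len(covered) / len(known_items) < target:
        best = max(remaining, key=gain)
        if gain(best) == 0:
            break  # no remaining service adds any coverage
        items, cost = remaining.pop(best)
        covered |= items & known_items
        chosen.append(best)
        spent += cost
    return chosen, len(covered) / len(known_items), spent


# Hypothetical sample of 20 known references and three services.
known = set(range(1, 21))
services = {
    "A": (set(range(1, 15)), 100.0),
    "B": (set(range(10, 19)), 60.0),
    "C": ({19, 20}, 50.0),
}

chosen, achieved, spent = select_services(services, known, target=0.9)
print(chosen, achieved, spent)
```

With these made-up figures, services B and A together reach the 90%
target for 160 cost units, and the small service C is not needed. The
alternative strategy, best coverage for a fixed budget, could be
sketched the same way by stopping when the budget is exhausted.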
Time Factors 

Another quality factor relating to a machine-readable
data base is the extent to which it is up-to-date. The
time lag between publication of a paper or report and its
appearance in an abstracting and indexing service will have
a significant effect on the value of that service, at least
for current awareness purposes. In general, services in
which human intellectual operations have been eliminated
or at least kept to a minimum (as in the case of the Science
Citation Index) are likely to be more current, all other
things being equal, than a service in which delays occur in
human indexing or abstracting operations. Even two services
based on human indexing may exhibit substantial difference in
their degree of currency. Bourne's study of the Bibliography
of Agriculture (20), for example, showed that on the average,
agriculture-related literature, included in this publication,
appeared later than it did in eight of eleven services with
which it was compared. Based on a sample of 617 citations
published by both services, it was found that Chemical
Abstracts, which includes abstracts in addition to the
citation data provided in the Bibliography of Agriculture,
published 3.7 months earlier on the average.

To be judged valuable by a scientist, a current awareness
service must not only inform him of items relevant to his
interests but it must also inform him of many of these items
before they are brought to his attention in any other way. In
other words, a current awareness service must have a high
novelty factor if it is to survive as a commercial venture. A
service is unlikely to survive if, due to processing delays,
it brings to the attention of its customers literature that
they were mostly aware of previously. In comparing the
suitability of a particular data base for providing current
awareness services NASIC must give this factor of currency
careful attention. A data base that is not very current, at
least for a major proportion of the sources it covers, should
not be used for dissemination services, although it may still
be quite valuable for retrospective search purposes. Given
two data bases covering essentially the same subject area
(e.g., MEDLARS and EXCERPTA MEDICA) the more current should
always be given preference, all other things being
approximately equal.

In evaluating various data bases in respect to their
timely coverage of pharmaceutical information, Ashmole et al (25)
discovered that the Science Citation Index data base (ASCA) 
averaged delays of only 0-3 weeks from time of publication, 
while most other services averaged 2-6 months, and Biologic- 
al Abstracts averaged 4-12 months. 

A related time factor to be taken into account is that of
how frequently a machine-readable data base is updated. For
current awareness purposes, a monthly update is probably the
minimum that is acceptable, and an update every two weeks is
preferable. For the purposes of retrospective search,
frequency of update is somewhat less critical. Some suppliers
of machine-readable data bases offer a differential pricing
structure based on frequency of updating. That is, it is
possible to lease or license a data base more cheaply if the
purchaser will accept less frequent updating. Savings of this
type are unlikely to be justified if the data base is one in
heavy use or if it is used as the basis of dissemination
activities.
Indexing and Vocabulary Considerations

These factors have already been discussed to some extent.
The terms used to represent the subject matter of documents
(whether these are natural language words in titles or
abstracts, subject headings, descriptors, or codes from some
classification scheme) must be specific. A nonspecific
vocabulary will not allow the conduct of specific searches and
a system based on broad terminology is doomed to a low level
of precision.* Natural language vocabularies (i.e.,
uncontrolled) tend to be quite specific. For example, a
searchable natural language abstract, if well constructed, is
likely to be a highly specific representation of the subject
matter of a document. Data bases in which subject matter is
represented by index terms selected from a controlled
vocabulary (e.g., a thesaurus) tend to be somewhat less
specific. Thus, all other things being equal, a natural
language data base is likely to provide the capability for
achieving a higher precision than a data base using a
controlled vocabulary. The capability of searching very
specifically is more likely to be of importance for
retrospective search services (where some requests may be made
for quite precise topics of limited scope) than it is for
current awareness purposes (where users are usually seeking
coverage of broader areas related to a wider range of current
professional interests).

*It should be borne in mind, however, that the specificity
of an indexing vocabulary must be considered in relation to
all terms assigned to a particular item. A term standing
alone may be quite general, but it may take on a specific
connotation when used in conjunction with some other term.

Although natural language is generally quite specific,
a data base in which only natural language (e.g., titles
and/or abstracts) is used does present various other problems
to the searcher. In a data base founded upon a controlled
vocabulary, this vocabulary serves several important
functions. It controls synonyms and near-synonyms,
differentiates homographs, and links together terms that are
semantically related. The thesaurus, or other form of
controlled vocabulary, thus normalizes the language used in
indexing and provides a very valuable aid to the searcher.
Without some form of controlled vocabulary the searcher (or
constructor of profiles) is "left to his own devices". It is
up to him to think of all possible ways in which a particular
topic might be represented in natural language titles or
abstracts (not an easy task) because the system gives him no
help. Actually, this last statement is not always true. Some
systems based on natural language issue some type of searching
aid in which the substantive words occurring in the data base
are listed and grouped in various ways that may be of use to
the searcher.
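
The help a thesaurus gives the searcher can be sketched as a term
expansion step: a search term is replaced by itself, its synonyms,
and all of its narrower terms. The miniature thesaurus below, its
field names ("use_for", "narrower") and its entries are hypothetical
and are not drawn from any actual controlled vocabulary.

```python
# Hypothetical miniature thesaurus: each preferred term lists the
# synonyms it stands for and its narrower (more specific) terms.
THESAURUS = {
    "neoplasms": {
        "use_for": {"tumors", "tumours", "cancer"},
        "narrower": {"carcinoma", "sarcoma"},
    },
    "carcinoma": {
        "use_for": set(),
        "narrower": {"adenocarcinoma"},
    },
}


def expand(term, thesaurus):
    """Return `term` together with its synonyms and, recursively,
    all of its narrower terms. Assumes the hierarchy has no cycles."""
    expanded = {term}
    entry = thesaurus.get(term)
    if entry is None:
        return expanded  # term has no entry; search it as given
    expanded |= entry["use_for"]
    for narrower in entry["narrower"]:
        expanded |= expand(narrower, thesaurus)
    return expanded


search_terms = expand("neoplasms", THESAURUS)
print(sorted(search_terms))
```

With a natural language data base and no searching aid, the searcher
would have to think of every one of these variants unaided.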

Semantic and syntactic ambiguities of the "false
coordination" and "incorrect term relationship" type are
likely to be more prevalent in the searching of natural
language abstracts than they are in the searching of data
bases in which subject matter is represented by index terms
selected by indexers from a controlled vocabulary.

A second, related quality consideration is that
(previously mentioned) of the exhaustivity of the indexing -
the number of access points to subject matter provided. For
comprehensive (high recall) searches an exhaustively "indexed"
data base is a necessity. A data base in which a very limited
number of access points is provided (e.g., titles only or
only 2-3 index terms) is unlikely to be capable of providing
a high level of recall at an acceptable level of precision.
O'Donohue (22), describing experiences with a number of
search services, points out that "references that were
overlooked in computer scanning usually had inappropriate or
insufficient keywords". In this connection it is interesting
to note that searches of CA Condensates tapes at the National
Technological Library, Denmark, have consistently been able
to operate only in the range of 30 - 40% precision, even when
special efforts were made to reduce irrelevancy (21) as much
as possible.
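
Recall and precision, the two measures used throughout this
discussion, can be computed for any single search once the relevant
items are known. The sketch below uses the usual definitions (recall
as the share of relevant items retrieved, precision as the share of
retrieved items that are relevant); the item numbers are hypothetical
and are chosen so that precision falls in the 30 - 40% range
mentioned above.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one search.

    recall    = relevant items retrieved / all relevant items
    precision = relevant items retrieved / all items retrieved
    """
    hits = retrieved & relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical search: 20 relevant items exist in the data base;
# the search retrieves 40 items, 14 of which are relevant.
relevant = set(range(1, 21))
retrieved = set(range(7, 47))

recall, precision = recall_precision(retrieved, relevant)
print(f"recall {recall:.0%}, precision {precision:.0%}")
```

In this made-up case the searcher finds 70% of the relevant items
but must examine 40 retrieved items to do so, of which only 35% are
relevant.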

A complete, searchable abstract is, of course, one form
of exhaustive indexing. Multiple access points can be provided
in a number of ways, including citation linkage. However, to
ensure the high recall capability that may be needed in a
current awareness or retrospective search system, the subject
matter of a document needs to be represented directly and
exhaustively in the document record stored in the data base.

Other quality factors relate to the accuracy and
consistency of the indexing. Some machine-readable data bases
have been found to contain a high level of indexing error
and/or inconsistency. These problems can only be recognized
through experience in using a particular data base or through
ascertaining the experience of other data base users.

While NASIC will need to make its own evaluation of
various data bases, in terms of the various criteria mentioned
in this report and the recognized needs of the NASIC community,
it can also draw upon the evaluations conducted by others. A
few evaluations, some comparative, have appeared in print.
Besides the study of O'Donohue (22), a comparative evaluation
of Ringdoc and CBAC has been reported by Devon et al (23), and
six different data bases were compared by Beauchamp et al (29)
in terms of their yield in searching for literature on various
chemical compounds. Scot et al (30) have reported on an
evaluation of the Drugdoc service of EXCERPTA MEDICA, and
Ashmole et al (25) have described a comparative study of
several data bases in terms of their "cost-effectiveness" in
searching for pharmaceutical references.

A final group of considerations relating to the acceptability
of machinable data bases is concerned with reliability
and ease of implementation. Most of the very large machine-readable
bibliographic files are produced as a by-product of a
publishing operation. This creates the danger that the 
records may contain various elements that are needed for 
publication purposes but are superfluous or possibly even
obstructive in the use of the file for literature searching
purposes. This problem of file "garbage" is less critical
now than it was formerly since several publishers generate
a separate "clean" file for search purposes. Nevertheless,
some machine-readable records still contain a considerable
amount that is "garbage" as far as the information service
center is concerned.

Continuity

A major consideration for any service center should 
be the degree to which the continuity of the data base is 
assured. Of course, many suppliers of data bases have a
long record of successful publishing behind them and it is
not likely that these enterprises will "fold" in the foreseeable
future. In the evaluation of newer data bases, on
the other hand, NASIC should pay special attention to the 
probable continuity of the file and should avoid the devel- 
opment of services relying on data bases whose future does 
not seem to be assured. Data bases compiled with the aid 
of funds supplied by government agencies, without other
guaranteed sources of financial support, are especially suspect.
As of this writing, for example, there is a strong
move on the part of the National Institute of Neurological 
Diseases and Stroke to withdraw support from most elements 
of the Neurological Information Network, including the important
services of the Brain Information Service (UCLA),
which is estimated to actively serve some 20,000 neuroscientists
on a worldwide basis.

Ease of Implementation

Finally, in the case of data bases to be implemented 
by NASIC in-house, ease of such implementation must be considered.
Considerations here will include record format,
code structure (ASCII, EBCDIC, etc.), tape packing density,
tape tracks, and the presence of standard (e.g., OS) "labels".
In other words, considerations here must relate to compatibility
or ease of convertibility to formats suitable for processing
with NASIC hardware and available software, including the
amount of pre-processing necessary to integrate the handling 
of a particular data base with others that NASIC may be pro- 
cessing. A related consideration is the amount of documen- 
tation the data base producer is able and willing to supply 
and his willingness to provide timely information on antici- 
pated changes to the contents and format of the files. 
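
As a small illustration of the pre-processing referred to above, the following sketch decodes one fixed-length EBCDIC record into named fields. It is not drawn from any supplier's documentation: the record layout, the field names and the choice of the "cp037" code page are all assumptions made for the purpose of the example.

```python
# Hypothetical fixed-length record layout: (field name, start, end).
# A real producer's tape documentation would supply the actual layout.
LAYOUT = [("accession", 0, 8), ("year", 8, 12), ("title", 12, 52)]


def decode_record(raw: bytes, encoding: str = "cp037") -> dict:
    """Decode one fixed-length record and split it into named fields.

    "cp037" is one of Python's built-in EBCDIC code pages; it is a
    single-byte encoding, so byte offsets equal character offsets.
    """
    text = raw.decode(encoding)
    return {name: text[start:end].strip() for name, start, end in LAYOUT}


# Build a sample EBCDIC record so that the sketch is self-contained.
sample = "A0001234" + "1973" + "Searching CA Condensates".ljust(40)
record = decode_record(sample.encode("cp037"))
print(record)
```

The more a producer's format departs from a simple layout of this kind, the more conversion effort must be budgeted before the file can be searched alongside others.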

Sturdivant (26), reporting experience at Marathon Oil
Co., has confirmed that some data bases will be more difficult
to implement than others. He reports that NTIS files
presented great difficulties in conversion to the common
format adopted (API) because of "extremely complex coding".
The files of the U.S. Geological Survey were also difficult
to use. The GEO-REF files of the American Geological Institute
were easiest to re-format.

In situations in which NASIC is able to acquire searching
"software" from the supplier of a data base, the data
base and the related software need to be evaluated together
in terms of the searching capabilities that the software
provides. This takes us beyond the characteristics of the
data base as such and into considerations involving the mode 
of acquisition and implementation of a service based on a 
data base. This type of consideration will be dealt with 
in the next section of this report.

D. ACQUISITION OF DATA BASES OR DATA BASE SERVICES: ALTERNATIVES

For any particular collection of data bases NASIC may 
offer services by any of the modes identified in Table 2. 
The major choice is the one between (a) acquiring a data 
base for in-house use, and (b) acquiring services from that
data base from a service center already in existence. In
actual fact, a completely free choice does not really exist,
at least for several data bases of possible interest. In
the first place NASIC, under the terms of its agreement with
NSF, has a commitment to exploit the use of existing academic
information centers as much as possible. Moreover, where such
services exist for a particular data base and are well established,
it is unlikely that NASIC could compete economically
with these services (i.e., in most cases it is unlikely that
an in-house operation could offer services more cheaply than
those available through existing service centers). However,
because of the broad scope of its activities, NASIC will undoubtedly
identify some data bases for use that are not handled
by existing information centers. For these data bases
NASIC may elect to offer services directly, through acquisition
of the data base for in-house searching. For this reason,
then, it is worth giving some consideration to the entire
range of "acquisition" possibilities.

TABLE 2

MEANS BY WHICH DATA BASES OR DATA BASE
SERVICES CAN BE ACQUIRED

NASIC obtains data base by purchase or licensing arrangement
for in-house manipulation.

   a. Off-line

   b. On-line

NASIC purchases services from another agency:

   a. From the producer of the data base

      i. Off-line service

      ii. On-line service

   b. From an existing service center (retailer)

      i. Off-line service

      ii. On-line service

Although a few data bases are available on a straight
purchase or subscription basis, the majority of machine-readable
files must be leased or licensed. Generally, a leasing
arrangement allows the recipient organization to offer
services to its own staff members only. For example, a
particular industrial organization may lease a machine-readable
bibliographic file in order to offer retrospective
search and/or SDI services within the company. If, as is
the case with NASIC, the recipient organization is to offer
services to a wider community of users, on a fee basis, it
must enter into a licensing agreement with the supplier of
the data base.

Although leasing and licensing arrangements and condi- 
tions vary somewhat from one data base producer to another, 
there is some commonality between them. In general, a data 
base is leased at a fixed annual fee. A licensing arrangement,
on the other hand, is based on a fixed annual fee plus some
form of royalty arrangement related to amount of use. As
an illustration we can consider the situation pertaining to
the use of the COMPENDEX tapes of Engineering Index Inc. The
current files can be licensed for an annual fee of $6000. In
addition, the licensee pays to Engineering Index $2.50 per
year for each SDI profile it runs, up to the first hundred
such profiles. After the first hundred profiles a sliding
scale comes into effect, reducing to $1.70 per profile per
year for each profile over 1000 serviced. For retrospective
search purposes the arrangement is somewhat different. Four
years of the data base, 1969-1972, can be licensed at an annual
base fee of $21,600. The royalty for retrospective services is
$2.00 per query per year of COMPENDEX searched. Alternatively, the
licensee may pay a flat annual fee of $500 per year of COMPENDEX
searched, with no limit imposed on the number of searches conducted.
Where COMPENDEX is made available for on-line searching
(as it is, for example, through Lehigh University) the royalty
claimed by Engineering Index is 10% of the charge paid by
the user for each search.
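
The choice between the two retrospective royalty options quoted above can be reduced to a simple break-even calculation. The sketch below is illustrative only. It assumes that the flat fee of $500 is charged once a year for each year of file licensed, and that every query is run against all of the licensed years; neither assumption is confirmed by the supplier's terms, and the function names are our own.

```python
# Royalty figures quoted in the text for COMPENDEX retrospective searching.
ROYALTY_PER_QUERY_PER_FILE_YEAR = 2.00  # dollars
FLAT_FEE_PER_FILE_YEAR = 500.00         # dollars a year, unlimited searches


def per_query_royalty(queries_per_year: int, file_years_per_query: int) -> float:
    """Annual royalty when paying per query per year of file searched."""
    return ROYALTY_PER_QUERY_PER_FILE_YEAR * queries_per_year * file_years_per_query


def flat_royalty(file_years_licensed: int) -> float:
    """Annual royalty under the flat-fee option (assumed interpretation)."""
    return FLAT_FEE_PER_FILE_YEAR * file_years_licensed


def break_even_queries(file_years: int) -> float:
    """Queries a year at which both options cost the same, assuming
    every query searches all of the licensed file years."""
    return flat_royalty(file_years) / (ROYALTY_PER_QUERY_PER_FILE_YEAR * file_years)


print(per_query_royalty(100, 4))  # 100 queries over the four-year file
print(flat_royalty(4))            # flat fee for the same four years
print(break_even_queries(4))      # volume at which the options are equal
```

Under these assumptions the flat fee becomes the cheaper option only when annual demand exceeds the break-even volume, whatever the number of file years licensed. The base licensing fee is payable in either case and is therefore left out of the comparison.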

In the case of COMPENDEX then, the royalty arrangement 
is based upon the number of searches conducted, or profiles
serviced, per year. Other data base suppliers may have a
somewhat different basis for royalty assessment. The International
Food Information Service (IFIS) royalties, for example,
are based on the number of citations supplied to customers.

Some additional points relating to leasing or licensing 
arrangements are also worth noting: 

1. Some suppliers of data bases, including the Institute
for Scientific Information, offer a year of
"grace" as far as royalties are concerned. That is,
royalties for outside use of a data base are imposed
only after the first year of operation.

2. Sometimes the subscription to a tape service is tied
to a subscription to the equivalent printed index.
That is, an organisation can subscribe to the magnetic
tapes only if it also subscribes to the printed index.
If the subscription to the latter is cancelled,
the tapes must be returned to the supplier. This is
the situation, for example, in the case of tapes available
from the American Society for Metals and from the
International Food Information Service.

3. Postage and cost of magnetic tape reels is normally
charged in addition to the leasing or licensing base
fee.

4. Cost of leasing or licensing may vary slightly, depending
upon how frequently the lessee (licensee)
requires his file to be updated. He pays less if
he accepts less frequent updates of his file.

5. A leasing or licensing arrangement allows an information
center to acquire a machine-readable data base
and that only. Such an arrangement does not also
provide searching software. If the data base supplier
has such software available, this must be acquired
by a separate purchase. In actual fact, many data
base suppliers do not have searching software available.
The lessee (licensee) must develop his own
software or must acquire existing software from elsewhere
(i.e., from another service center). The ASIDIC
Survey of Information Center Services (27) lists 24
centers making software available to customers.

6. It is unlikely that NASIC could obtain exclusive rights
to provide service from a particular data base for the
Northeast United States, although it is possible that
for certain data bases it might be possible to obtain
exclusive rights for the whole country (in the way that
the United Kingdom Chemical Information Service holds exclusive
rights in that country for the products of the
Chemical Abstracts Service), especially in the case of
data bases of non-U.S. origin.



Let us now turn our attention to the acquisition of bibliographic
services from an existing supplier. Possible suppliers
to be considered are enumerated in Table 3. For some
data bases, only one avenue of access is open to NASIC. Service
from the New York Times Information Bank, for instance,
can be achieved only through a subscription arrangement allowing
remote on-line access to the files maintained by the producer.
In other cases, however, a number of possibilities for
service exist. In the case of COMPENDEX, for example, the tapes
can be obtained under a licensing arrangement for in-house processing,
batch processing services can be purchased through
the University of Georgia or the University of Pittsburgh (to
name only two), or the data base could be accessed remotely
through the UCLA batch and on-line system through ARPA. In the
case of some data bases, the choice may be one between the purchase
of services directly from the producer (wholesaler) or
the purchase of services from a middleman (retailer); some may
involve a choice between the licensing of the data base or the
purchase of service from the supplier of the data base; some
may merely involve a choice between different middlemen (retailers).

TABLE 3

POSSIBLE SUPPLIERS OF BIBLIOGRAPHIC SERVICES

1. A data base producer offering batch processing services, e.g.,
BIOSIS.

2. The academic information centers, founded with NSF support,
offering batch processing services, e.g., University of
Georgia.

3. Other scientific information dissemination centers offering
batch processing services, e.g., IITRI.

4. A data base producer offering direct on-line access, e.g.,
The New York Times.

5. An academic information center offering on-line access, e.g.,
Stanford.

6. Other licensees offering on-line access to data bases, e.g.,
Lockheed, System Development Corporation.

In theory all of these options are open to NASIC, although
in practice established policy constraints may limit the options
available. Nevertheless, for certain data bases at least, NASIC
will be faced with various service options and the general possibilities
should all be examined for this reason. Several major
producers of data bases do not themselves offer services based
upon these files. Notable examples are Chemical Abstracts Service
and Engineering Index Inc. The choice here is between
acquiring the data base through a licensing arrangement or
purchasing service from an existing service center. Where
service from a particular data base is already available
from a service center, this mode of access will normally be
the preferred one. NASIC is expected to capitalize on existing
centers as much as possible. Moreover, it is unlikely that
NASIC would offer services more economically as an in-house operation,
unless the volume of searches conducted by NASIC was
large enough to reduce the unit cost per search to quite a
small figure. Even in this situation, however, in-house operation
is not necessarily more economical because NASIC could
undoubtedly negotiate with an existing service center to achieve
greatly reduced rates for a large volume of business. It is not
possible to lay down any hard and fast figures here. For any
particular data base in which this choice is involved NASIC would
need to:

1. Estimate annual demand on the data base (both in terms
of retrospective searches and SDI profiles).

2. Calculate costs of licensing the data base, acquiring
the software necessary to search it (e.g., through purchase
from another service center), and operating the
data base on equipment available to NASIC.

3. Divide these costs by the projected demand in order to
arrive at an estimated unit cost per retrospective
search and an estimated unit cost per SDI profile per
year.

4. Enter into negotiations with existing suppliers in order
to obtain a unit cost quotation for the estimated
annual volume of NASIC business.

5. Compare the most favorable costs from these service centers
with estimated unit costs for in-house operation.
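
The five steps above can be sketched as a short calculation. The sketch treats one type of service at a time (retrospective searches or SDI profiles), so in practice it would be run once for each. All names and all dollar figures in it are hypothetical and serve only to show the form of the comparison.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class InHouseCosts:
    """Annual in-house costs for one data base (step 2)."""
    licensing: float  # licence fee plus royalties
    software: float   # annualized cost of search software
    operation: float  # cost of running the file on local equipment

    def total(self) -> float:
        return self.licensing + self.software + self.operation


def in_house_unit_cost(costs: InHouseCosts, annual_demand: int) -> float:
    """Step 3: spread annual costs over the projected demand (step 1)."""
    if annual_demand <= 0:
        raise ValueError("annual demand must be positive")
    return costs.total() / annual_demand


def preferred_source(costs: InHouseCosts, annual_demand: int,
                     quotes: Dict[str, float]) -> Tuple[str, float]:
    """Steps 4 and 5: compare the best quoted unit cost with in-house cost."""
    best_center = min(quotes, key=quotes.get)
    own_cost = in_house_unit_cost(costs, annual_demand)
    if own_cost < quotes[best_center]:
        return ("in-house", own_cost)
    return (best_center, quotes[best_center])


# Hypothetical figures, for illustration only.
costs = InHouseCosts(licensing=6000.0, software=2000.0, operation=4000.0)
quotes = {"Center A": 35.0, "Center B": 45.0}
print(preferred_source(costs, 300, quotes))  # moderate demand
print(preferred_source(costs, 600, quotes))  # heavier demand
```

With these figures the existing center is preferred at the lower volume and in-house operation at the higher, which is the volume dependence described above. A quotation negotiated for the higher volume (step 4) could, of course, reverse the second result.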

In certain cases a choice may exist between purchase of 
service from a service center and purchase of service direct- 
ly from the producer of the data base. Here cost comparisons 
must again be made. It is not unlikely that in this situation
the most cost-effective approach would be to deal directly
with the data base producer. For example, BIOSIS offers an
SDI service, Current Literature Alerting Service (CLASS), based
on tapes of BA Previews at a cost of $120 per annum per profile.
This service purchased through the North Carolina Science and
Technology Research Center costs $200 per annum; purchased
through IITRI, it costs $250 per annum. These figures were derived
from the ASIDIC Survey of Information Center Services, compiled
by Williams and Stewart (27).

For some data bases, especially those that are relatively
small and highly specialized, the only method by which NASIC
can offer service may be to acquire the data base (by purchase
or license) and provide service directly from the NASIC center
(i.e., situations in which the supplier of the data base does
not offer service and no service on this data base is now offered
by existing service centers). In other cases, no service
on a particular file is available through a service center and
the choice lies between acquisition of the data base for in-house
use or purchase of service directly from the producer. This
choice must again be based on cost estimates and it will be volume-dependent.
As an example, consider the Comprehensive Data
Base of Patents (Chemical) available from IFI/Plenum Data Corp.
This data base, of about 283,000 chemical patents going back
to 1950, can be purchased (together with appropriate searching
programs) for $34,000, plus $16,000 per annum for each
yearly update. Over a five year period, with four annual updates,
this data base would cost $99,000 to acquire, or about
$20,000 per annum, exclusive of in-house operating costs for
loading the data base and running searches. Only with a volume
of retrospective search demand well in excess of 200 searches
per year would it be likely that NASIC could operate this
data base in-house at a lower unit cost per search than it could
purchase service from the IFI/Plenum service bureau ($150 per
search).

On the other hand, some of the smaller specialised data
files may be operated in-house by NASIC more cheaply on even a 
relatively small volume of demand. For example, the bibliograph- 
ic tapes of the Crystallographic Data Centre (University Chemi- 
cal Laboratory, Cambridge, England) and the searching programs 
to interrogate them can be acquired for an initial cost of about 
$3000 and annual update costs in the region of $200. Over a five 
year period comparatively few searches may be needed to make in- 
house operation less expensive than the purchase of service from 
Cambridge at a minimum cost of around $25 per search. 
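
The two examples just given follow the same break-even arithmetic, which can be sketched as below. The purchase, update and service prices are those quoted in the text; the in-house operating cost per search is not given in the text, and the $60 used here is a purely hypothetical value. Note that the IFI/Plenum components as quoted ($34,000 plus four updates at $16,000) sum to $98,000, slightly less than the rounded five-year figure cited, though still about $20,000 per annum.

```python
def acquisition_cost(initial: float, annual_update: float, years: int) -> float:
    """Purchase price plus one update for each year after the first."""
    return initial + annual_update * (years - 1)


def break_even_searches_per_year(initial: float, annual_update: float,
                                 years: int, service_price: float,
                                 in_house_cost_per_search: float = 0.0) -> float:
    """Annual search volume at which in-house operation costs the same
    per search as buying the service at service_price."""
    margin = service_price - in_house_cost_per_search
    if margin <= 0:
        raise ValueError("in-house cost per search must be below the service price")
    annual_acquisition = acquisition_cost(initial, annual_update, years) / years
    return annual_acquisition / margin


# IFI/Plenum patents file: figures from the text, five-year horizon.
print(acquisition_cost(34000, 16000, 5))
print(break_even_searches_per_year(34000, 16000, 5, 150))
# Same file with a hypothetical in-house operating cost of $60 a search.
print(break_even_searches_per_year(34000, 16000, 5, 150, 60))
# Crystallographic Data Centre tapes: figures from the text.
print(break_even_searches_per_year(3000, 200, 5, 25))
```

Ignoring operating costs, the patents file breaks even at roughly 130 searches a year; any realistic operating cost pushes the figure higher, which is consistent with the estimate of well over 200 searches given above. The crystallographic file, by contrast, breaks even at about 30 searches a year before operating costs.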

It seems likely that NASIC will operate in a number of dif- 
ferent modes. In fact, when fully operational, all the access 
modes of Table 2 may be used. The majority of data bases are
likely to be accessed through existing service organisations, in
both on-line and off-line modes, some others will be accessed
through the data base producer (both on-line and off-line),
and a few may be processed in-house by NASIC itself. One
problem that NASIC must face is the problem of choosing between
various service centers, where a particular data base
is available through several such centers. Some guidelines
relating to this choice are given in the next section of this
report.

E. GUIDELINES ON THE CHOICE OF SERVICE CENTERS



Given that a number of different suppliers of services 
exist for a particular data base (SDI service based on Chemical
Condensates is, for example, available from ARAC, IITRI,
the University of Pittsburgh, UCLA, the University of Georgia, 
the North Carolina Science and Technology Research Center, the 
American Petroleum Institute, and the Institute of Paper Chem- 
istry), what guidelines can be used by NASIC to help in decid- 
ing which information center to use? For certain data bases 
an important consideration will be the decision as to whether 
to acquire access on a batch processing basis or in an on-line 
mode. In the following paragraphs some considerations relat- 
ing to the choice of service center are discussed. The fact 
that, as a matter of policy, NASIC may wish to give preference 
to the six academic information centers previously mentioned 
is tacitly assumed and will not be considered directly in the 
discussion. 

Before we can discuss the comparison or evaluation of 
various service centers, however, it is necessary for us first 
to consider by what criteria information services are evaluated 
by their users. The most important of these criteria are enumerated
in Table 4. In general, any service or product is judged
in terms of cost, time and quality factors. An information
service is no different from other types of service in this respect.
The service must be provided at a cost that the user
feels is reasonable in relation to the benefits associated with
it. Cost to the user involves more than direct charges. It includes
the cost of his own time, i.e., how much effort is involved
in use of the system. Studies of the information seeking
behavior of scientists and other professionals have consistently
shown that accessibility and ease of use are the prime
factors influencing choice of an information source. In general,
the most convenient source of information will be chosen,
whether or not this source is perceived by the user to be
the most comprehensive, authoritative or, in some sense, the
"best". Ease of use factors include ease of interrogating the
system in the first place (i.e., ease of making one's needs
known) and ease of use of the product provided by the system
(i.e., the form of output supplied). A very important facet
of the latter is the availability of an efficient and convenient
document delivery capability. A service that delivers
bibliographic citations goes only part of the way toward satisfying
an individual's information needs. Such a service
causes considerable frustration if the user is unable to obtain
the documents cited or can only do so through procedures that
he views as inconvenient and time-consuming.

TABLE 4

CRITERIA BY WHICH USERS EVALUATE
INFORMATION SERVICES

Cost
   Direct charges
   Effort involved in use
      Ease of interrogating system
      Form of output provided
      Backup document delivery capability
Response Time
Quality Considerations
   Coverage (completeness)
   Recall
   Precision
   Novelty
   Accuracy of data
Cost-Effectiveness (a quality consideration)
   Cost per relevant document or reference supplied.
   Cost per new relevant document or reference supplied (i.e.,
   novelty-cost ratio).

The users of information services may be viewed as having 
four major types of information needs: 

1. Specific factual information of the type that might
come from some type of reference book or from a
machine-readable data bank (e.g., thermophysical
property data on a particular substance).

2. A few "good" articles (or references to them) on a
specific topic.

3. A comprehensive literature search in a particular subject
area.

4. A current alerting service whereby the user is kept
informed of new literature relevant to his current
professional interests.

These different needs have different response time require- 
ments associated with them. The requirement relating to the 
current alerting service is that it should deliver regularly 
and frequently and that the information supplied should be as 
current as possible. The user needing a comprehensive literature
search is usually a person engaged in a relatively long-term
research project. Speed of response is usually not critical
to him (except that there may be some date beyond which
the search results will have no value or, at least, greatly
reduced value); he is willing to wait longer in order to achieve
completeness (i.e., completeness is more important to him than
speed). For the other types of information needs, on the other
hand, the user will generally want rapid response, in fact the
type of turnaround time that is usually associated only with on-line
systems.

Beyond cost, effort and time factors, the user will also
be concerned with the quality of the product provided. The
quality considerations of most concern to him are the completeness
of the data base (i.e., coverage), the completeness
of a particular search (i.e., recall), the degree of relevance
of the search results (i.e., precision), the novelty of the
results (important in the case of a current awareness service,
which is really only valuable if it brings things to a user's
attention before he learns of them by other means), and the
accuracy of the results (which is a quality factor related to
data retrieval systems rather than bibliographic systems).
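
The search-level measures named above can be stated precisely for a single search. The sketch below uses the customary ratio definitions (recall as the fraction of relevant items retrieved, precision as the fraction of retrieved items that are relevant, novelty as the fraction of relevant retrieved items new to the user), together with the cost per relevant reference of Table 4. The document identifiers and the $30 search cost are invented for the example.

```python
from typing import Dict, Iterable


def search_measures(retrieved: Iterable, relevant: Iterable,
                    already_known: Iterable = ()) -> Dict[str, float]:
    """Recall, precision and novelty ratio for one search."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant             # relevant items retrieved
    new_hits = hits - set(already_known)    # of those, new to the user
    return {
        "recall": len(hits) / len(relevant) if relevant else 0.0,
        "precision": len(hits) / len(retrieved) if retrieved else 0.0,
        "novelty": len(new_hits) / len(hits) if hits else 0.0,
    }


def cost_per_relevant(search_cost: float, retrieved: Iterable,
                      relevant: Iterable) -> float:
    """Cost per relevant reference supplied (see Table 4)."""
    hits = set(retrieved) & set(relevant)
    if not hits:
        raise ValueError("no relevant references were retrieved")
    return search_cost / len(hits)


# Invented example: 10 references retrieved, 8 relevant ones in the
# file, 4 of them retrieved, 1 of those already known to the user.
retrieved = range(1, 11)
relevant = [1, 2, 3, 4, 11, 12, 13, 14]
measures = search_measures(retrieved, relevant, already_known=[1])
print(measures)
print(cost_per_relevant(30.0, retrieved, relevant))
```

Recall can be computed only if the full set of relevant items in the data base is known or estimated, which is one reason why comparisons of service centers on this measure are difficult in practice.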

All of these user criteria must be borne in mind by NASIC 
in planning its services and operations. It is clear that these 
criteria influence many decisions that must be taken. Some of 
them are related primarily to data base characteristics and in- 
fluence choice of data base, as discussed earlier in this report. 
Other criteria relate to NASIC's own organization and facilities 
(e.g., ease of use, document delivery backup), while yet other 
performance factors will influence choice of centers providing 
service to NASIC. It is these that we are primarily concerned 
with at this point. 

In choosing service centers NASIC must seek to identify 
those delivering the highest quality of product with the least 
processing delay and at the least cost. Unfortunately, these 
requirements tend to be conflicting. We must usually pay a 
higher price for quality and we may have to wait longer to 
achieve it. 

A major decision that NASIC must face for several data 
bases is the decision as to whether to access the data base 
through a center offering off-line, batch processing services 
or through a service organization (e.g., Lockheed, SDC) making 
the data base available for direct on-line interrogation from 
remote sites. For some data bases there is no choice available. 
The New York Times Information Bank, for example, can only be
accessed on-line, and the off-line MEDLARS operations are now
almost entirely phased out in favor of MEDLINE. To return to
the four types of information need noted earlier, the current
alerting activity is one which is probably handled most effectively
by off-line, batch processing (except that the construction
of user interest profiles is probably best handled by heuristic
interaction with an on-line system). The comprehensive
literature search is also best handled in the off-line mode,
where no stringent deadlines are imposed by the user, although
again there are obvious advantages associated with being able
to test out a strategy on-line on part of a data base before
committing this strategy to the more expensive search of the
complete file. For the other types of information needs, however,
the response time requirements of the user are likely to
be such that only rapid, direct access to the on-line data base
will be acceptable. It seems, then, that NASIC must consider
both off-line and on-line suppliers of service and that for
some data bases both modes of searching must be made available.
NASIC will, then, at some point in time, find itself in the position
of evaluating on-line service centers as well as off-line
service centers.

Parenthetically it is worth pointing out that on-line
access to a data base, through a service center, may well be
cheaper for a large volume of demand than off-line retrospective
searches, as well as offering more rapid response and the
possibility of improved results through the interactive nature
of the search. To illustrate this, consider the services offered
by Lockheed Information Retrieval Services, as depicted in
Table 5. Assume the highest level of fixed costs, namely $720
a month ($420 a month for rental of high speed display/printer
plus $300 a month for communications costs), that the capital
outlay on installation and service is amortized over two years
for about $30 a month, and that the average search takes 15
- 20 minutes on-line and produces a printout of 50 citations
or abstracts. Assume a volume of demand, over all five data
bases, of fifty searches a month (not an unreasonable level
of demand for such data bases within the Northeast academic
community as a whole). The cost per search would then be
about $30 ($750/50 = $15 plus $10 per search computer costs
plus $5 per search printout costs). This compares with a
figure quoted by the University of Georgia of $35 per volume
for a retrospective off-line search of RIE and $35 per volume
for a retrospective search of CIJE. The same figure is quoted
for a search through the Government Reports Announcements (NTIS)
files.

TABLE 5

ON-LINE INFORMATION SERVICE AVAILABLE FROM
LOCKHEED INFORMATION RETRIEVAL SERVICES

DATA BASES AVAILABLE

1. National Technical Information Service (NTIS)
2. ERIC - Research in Education (RIE)
3. ERIC - Current Index to Journals in Education (CIJE)
4. Exceptional Children Abstracts
5. PANDEX

COSTS

Installation and service (one-time cost)    $500 to $1000
Rental of high speed display/printer
terminal operating at 240 CPS               $420/month
Communications costs                        $200-300/month
Computer charges                            $25-35/hour
Off-line printing cost                      10¢ per item
                                            (citation or abstract)
Nominal royalty charge (for certain
data bases only)

Remember also that these results put the on-line service 
in the worst possible light as far as costs are concerned because
the full cost of the terminal rental and communications
is charged to the use of only five data bases. If Lockheed
adds additional data bases (or if the terminal is also used
to access other data bases elsewhere) and the cost is then
spread over a larger volume of usage, the cost per search for
on-line access could be reduced considerably. With ten data
bases accessible from the same terminal, and with a volume of
use on all ten data bases of 200 searches/month, the cost per
search could be $20 or less. (Tables describing the main features
of a number of available on-line systems are presented in
Appendix 2 of the Attachment.)
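
The on-line cost estimate worked through above can be reproduced with a few lines of arithmetic. The fixed monthly costs and the printing charge are those of Table 5 and the text. The connect time of 20 minutes and the computer rate of $30 an hour are assumptions chosen from within the quoted ranges because they reproduce the $10 per search used in the text; the function and parameter names are our own.

```python
def online_cost_per_search(searches_per_month: int,
                           terminal_rental: float = 420.0,
                           communications: float = 300.0,
                           amortized_installation: float = 30.0,
                           connect_minutes: float = 20.0,
                           computer_rate_per_hour: float = 30.0,
                           items_printed: int = 50,
                           print_cost_per_item: float = 0.10) -> float:
    """Estimated cost of one on-line search, in dollars."""
    if searches_per_month <= 0:
        raise ValueError("searches per month must be positive")
    # Fixed monthly costs are shared by all searches made in the month.
    fixed_monthly = terminal_rental + communications + amortized_installation
    computer = connect_minutes * computer_rate_per_hour / 60.0
    printing = items_printed * print_cost_per_item
    return fixed_monthly / searches_per_month + computer + printing


print(round(online_cost_per_search(50), 2))   # fifty searches a month
print(round(online_cost_per_search(200), 2))  # two hundred searches a month
```

At fifty searches a month the fixed costs account for half of the unit cost; at two hundred they fall to under $4 a search, and the estimate drops below the $20 mentioned above.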

Let us now return to the user-oriented evaluation criteria 
of Table 4 and give some further consideration to their implications
for the selection of service centers. Since NASIC
is to be a self-supporting operation, cost considerations will
be of major importance. Centers must be evaluated in terms of
the cost of their services and because costs of information
services tend to be volume-dependent, NASIC must endeavor to
negotiate the best possible terms for the projected volume of
business it will generate on a particular data base. For retrospective
search purposes the cost of dealing with a center
operating in the off-line mode must be compared with the cost
of accessing data bases from terminals located within a NASIC
center. In fact, this cost analysis may well be the most important
one. The difference in charges among the various off-line
centers is usually not very great, at least for several
of the data bases.

These charges are laid out conveniently in the ASIDIC Survey
(27) of 1972, although these data are now a little out-of-date.
As an example, the North Carolina center quotes a cost
for SDI service on the ERIC tapes of $75 per profile per year.
The Georgia center quotes a cost of $80 for the same service.
On the other hand, for an SDI service based on CA Condensates,
IITRI quotes a flat rate of $250 per profile per year, while Georgia
quotes $260 for academic customers and $364 for commercial
customers. In contrast, however, the Aerospace Research Appli- 
cations Center (ARAC) quotes a charge of only $195 for this 
same service. 

The response time requirement relates only to retrospective
searches. In assessing this feature of the various centers,
NASIC will need to contact a representative group of
existing customers of these organizations. The ASIDIC Survey
indicates that some processing centers operate with a short 
turnaround time in the range of 1 - 5 days, while others offer 
a service much less satisfactory, up to 20 days in some cases. 
Unfortunately, these data are presented in gross form only, 
the individual centers not being identified in the ASIDIC tab- 
ulation. In the O'Donohue (22) survey, seven centers were com- 
pared on processing time for retrospective searches. Only three 
of these centers routinely processed searches in a time which 
O'Donohue regards as **prompt**, namely less than two weeks. 
While analyses of this type are difficult to find in published 
form, other customers undoubtedly have made their own compari- 
sons and data of this type may be made available to NASIC. 

Somewhat related to the response time for retrospective
searches is the time it takes a service center to get an SDI
profile "up and running". O'Donohue (22) quotes a range of
from 2 weeks to 6 1/2 weeks (from initial inquiry to first machine
printout) for five processing centers from which SDI service
was received.

Before leaving the subject of response time it is worth
noting that NASIC must endeavor to identify centers that are
willing and able to handle certain requests in a "special
processing" mode. That is, they should be capable of handling a
special request on a "rush" basis where necessary.

While it is easy to compare centers in terms of their
charges, and relatively easy to compare them from the viewpoint
of response time, it is not at all easy to make this comparison
in terms of the quality of the product provided, the major
qualitative considerations that are at least partly
controllable by a service center being the recall and precision
of search results (see Table 4). This statement needs
qualification, however. It is relatively easy to judge the
performance of a center, at least in terms of the precision of
its searching, through a period of experience with this center.
It is difficult, however, to compare centers in terms
of what their service is likely to be (i.e., before a contract
is actually initiated). However, because NASIC is likely to be
a very substantial customer it seems reasonable that it should
negotiate a trial period for a number of profiles, with one or
more service centers, before a formal subscription is placed
with a center. In fact, NASIC might consider developing a
small group of "test searches" (i.e., searches for which a
known set of relevant documents is identified within a
particular data base) and use these test searches to evaluate
various processing centers in terms of both the recall and
precision of the search results, as well as response times.
O'Donohue (22) quotes some precision figures ("percent
relevant") for SDI service from several centers and based on
several data bases. These range from a low of 4% precision (the
Georgia center searching the Nuclear Science Abstracts tapes) to
a high of 54% (IITRI searching CA Condensates tapes). It should
be noted, however, that while there is likely to be a certain
minimum level of precision that is acceptable to a particular
user, an extremely high level of precision would suggest that
the profile is missing many of the relevant documents (i.e.,
recall is low) because there tends to be an inverse relationship
between recall and precision in searching. For example, if a
user interest profile consistently operates at 80% precision we
can be almost certain that it is also running at a very low rate
of recall. For both the current awareness need and the
"comprehensive search" need, high recall is likely to be more
important to the user than high precision (although precision
below a certain level may be intolerable). It must also be
recognized, however, that while a user is able to judge the
precision of a search (i.e., determine what proportion of all
citations delivered are relevant to his interests), he is
usually in no position to be able to judge its recall because he
does not know what the search may have missed. The recall ratio
achieved for a particular SDI profile or a particular
retrospective search can usually only be estimated by a
specially devised test and analysis.
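
The "test search" idea can be sketched in a few lines of Python. This
is a minimal illustration under stated assumptions, not a prescribed
evaluation procedure: the document identifiers and the results
attributed to "center A" and "center B" are invented for the example,
and the function names are hypothetical.

```python
def precision(retrieved, relevant):
    """Proportion of retrieved citations that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def recall(retrieved, relevant):
    """Proportion of the known relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


# Hypothetical test search: ten documents known in advance to be relevant.
known_relevant = {f"rel{i}" for i in range(1, 11)}

# Hypothetical output of two centers for the same test search.
center_a = ["rel1", "rel2", "rel3", "rel4", "noise1"]
center_b = [f"rel{i}" for i in range(1, 9)] + [f"noise{i}" for i in range(1, 13)]

for name, output in (("A", center_a), ("B", center_b)):
    print(name, precision(output, known_relevant), recall(output, known_relevant))
```

In this invented case center A shows high precision with low recall
and center B the reverse, which is the inverse relationship described
above. Recall can be computed here only because the relevant set was
fixed in advance; a user judging ordinary output can compute
precision alone.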

If we return now to a further examination of Figure 2, in
which the entire range of factors governing recall and precision
of a delegated search are displayed, it can readily be seen that
several of these factors are associated with characteristics of
the data base itself (i.e., indexing and vocabulary
characteristics) and are essentially outside the control of the
service center processing the data base. However, two very
important factors influencing performance are within the control
of the center, namely the quality of the interaction with the
user (procedures by which his information need is "negotiated"
with the system) and the quality of the searching strategies
used, whether for SDI or retrospective search.

These performance factors are closely related to the
precise modes of operation adopted by NASIC. Three broad modes of
operation appear possible:

1.  The scientist to be served is put into contact with
    the service center. Staff at this service center
    clarify his needs and prepare the user interest
    profile or strategy for a retrospective search.

2.  The scientist discusses his need with a NASIC
    representative (e.g., an Information Services Librarian)
    who then relays his interpretation of this need to
    the service center where a search strategy or profile
    is constructed.

3.  The scientist discusses his need with a NASIC
    representative who converts this need into a searching
    strategy or user interest profile that is then run at
    a service center.

In general, the second of these alternatives is the least
desirable since the more intermediaries placed between a user
and a data base, the less successful the search is likely to
be. It is well known that when a message is relayed through a
chain of people the possibility of distortion ("noise" in a
communication sense) exists at each step in this chain. The
first alternative is likely to produce the best results
initially because of the experience that personnel at a service
center will have accumulated in use of a particular data base.
However, this mode of operation provides no training
possibilities for NASIC staff. In the long run, the third
alternative may be the best mode to adopt. Once NASIC staff have
been trained to construct searching strategies or user interest
profiles for a particular data base and set of searching
programs, the fact that these staff members are "closer" to the
user community (physically if nothing else) may result in
improved user-system interaction and improved information
products as a result.

Of course, this presupposes that the service center dealt
with will allow NASIC to operate in this way. If the third of
the alternatives is the one that is organizationally most
acceptable to NASIC, an important selection criterion will
obviously be whether or not the service center will permit a
mode of operation in which search strategies or interest
profiles are prepared by NASIC staff and simply "run" (in a
machine sense) at the center. This mode of operation also
presupposes that the center has available adequate facilities
and materials for training people in search techniques. A center
that has no training program and no adequate search manuals
(which should be data-base related) will be unacceptable to
NASIC if the third of the processing alternatives is the
preferred one. It should be pointed out that some processing
centers have produced excellent searching guides and searching
tools, both of a general nature and related to particular data
bases. An example is IITRI, which has produced an excellent
Search Manual, with supplemental guides for particular data
bases, as well as a very useful word truncation guide. The
University of Georgia has likewise prepared a very complete
Profile Coding and Management Manual, conducts regular workshops
in profile construction techniques, and provides on-line access
to representative subsets of various data bases for training
purposes. This training program has been described by Park
et al. (28).

As far as the on-line approach to data bases is concerned,
one assumption is that NASIC personnel will search these data
bases for users by means of terminals located in a NASIC
center. In this case it is clearly imperative that NASIC staff
or representatives (ISL's) be well trained in the search
procedures associated with a particular system and the search
techniques needed to exploit a particular data base effectively.
If NASIC deals with an on-line service center to gain access to
various data bases, this center should be capable of providing
the necessary training, as well as appropriate searching aids.

As previously mentioned, the factors affecting performance
of an information service that are not primarily input-related
(the input-related factors, indexing and vocabulary, being
outside the direct control of NASIC or a service center) relate
to interaction with the user and the quality of searching
strategies. In other words, the quality of the service will be
heavily dependent upon the quality of the information staff who
are interacting with the users and the quality of the
information staff preparing search strategies or interest
profiles (whether these be NASIC staff members or personnel
associated with a processing center), the training of these
staff members, and their degree of experience with particular
data bases and particular searching software.

Clearly, however, we have just identified another important
variable affecting the performance of an information service
and thus the choice of a processing center, namely the
capabilities of the searching software. In general, different
service centers, both those operating on-line and those
operating off-line, are using different searching software. The
IITRI searching programs are different from those used in
Georgia, the Lockheed programs (DIALOG) are different from those
used by SDC (ORBIT), and all of these are entirely different
from the LEADER system of Lehigh.

In assessing the capabilities of a processing center, NASIC
must assess the capabilities of the search programs in use by
that center as well as the output options available from these
programs. This will be particularly critical in relation to
data bases that NASIC staff members interrogate directly
themselves (on-line or off-line). While all such programs have
the same general objectives and capabilities, there are
differences between them at the specific feature level. The
off-line search programs, particularly those searching text,
must go beyond simple Boolean AND, OR and NOT capabilities.
Weighted term searching (permitting the ranking of output) is an
important capability and nested search logic is an essential
requirement for systems operating exclusively in a Boolean
search mode. For text searching, word truncation capabilities
(both left and right truncation) are essential and word
proximity operators (i.e., the ability to specify how close two
words should be in text before they are considered to be
related) are highly desirable. A number of output formats should
also be available, both in terms of what is printed (citation,
abstract, etc.), in what sequence it is printed (i.e., sorting
options) and on what it is printed (i.e., the output medium).
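
The search features named above can be made concrete with a small
sketch. The Python below is a simplified illustration written for
this purpose and does not reproduce the software of any center
mentioned in this report; the query notation (tuples for nested
logic, "*" for truncation) and all function names are assumptions.

```python
import re


def tokenize(text):
    """Split text into lower-case word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_positions(pattern, tokens):
    """Positions of tokens matching a term; '*' marks left/right truncation."""
    stem = pattern.strip("*").lower()
    left, right = pattern.startswith("*"), pattern.endswith("*")

    def hit(token):
        if left and right:
            return stem in token
        if left:
            return token.endswith(stem)
        if right:
            return token.startswith(stem)
        return token == stem

    return [i for i, token in enumerate(tokens) if hit(token)]


def matches(query, tokens):
    """Evaluate a nested Boolean query: a term, or (operator, operands...)."""
    if isinstance(query, str):
        return bool(term_positions(query, tokens))
    operator, *operands = query
    if operator == "AND":
        return all(matches(q, tokens) for q in operands)
    if operator == "OR":
        return any(matches(q, tokens) for q in operands)
    if operator == "NOT":
        return not matches(operands[0], tokens)
    raise ValueError(f"unknown operator: {operator}")


def within(term_a, term_b, distance, tokens):
    """Proximity operator: are the two terms at most `distance` words apart?"""
    return any(abs(i - j) <= distance
               for i in term_positions(term_a, tokens)
               for j in term_positions(term_b, tokens))


def weighted_score(weights, tokens):
    """Weighted term search: sum the weights of the terms that occur."""
    return sum(w for term, w in weights.items() if term_positions(term, tokens))


title = tokenize("Polymerization of vinyl chloride using organometallic catalysts")
query = ("AND", "polymer*", ("OR", "catalys*", "initiator*"), ("NOT", "rubber"))
print(matches(query, title), within("vinyl", "chloride", 1, title),
      weighted_score({"polymer*": 3, "catalys*": 2, "rubber": 5}, title))
```

Ranking of output, as in weighted term searching, would follow from
sorting citations by the score; the sketch evaluates a single title
only.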

In the evaluation of on-line searching systems additional
requirements become important. These requirements include the
capability of developing term lists, thesauri, or other
searching aids, and the capability of providing various tutorial
and "help" features to the searcher.

Fortunately for NASIC, much of the work on the comparison
of searching software, as well as output capabilities, has
already been done. The ASIDIC Survey (27) has compared the search
and output features of the off-line processing centers,
including full members, associate members and non-members of
ASIDIC. These data are presented in gross form in Tables 6 - 8
and in a more exact form (i.e., each feature associated with
each center) in Tables 9 - 11.

The comparison of on-line searching systems has been done
by Stanford University, Institute for Communication Research,
under a grant from the National Science Foundation. The
features of the major on-line searching systems were discussed
and tabulated at an important meeting that took place at
Stanford in April, 1973. One of the authors of the present
report, F. W. Lancaster, attended this meeting and serves as an
advisor to Stanford on this project. As a result of the meeting
he prepared a full report entitled The Present Status of On-Line
Interactive Retrieval Systems. This report contains a complete
summary of the searching, output, training, tutorial and
monitoring features of the major extant on-line searching
systems. Because of its potential value to NASIC it is included
intact as Attachment 1 of this report.

Before leaving this subject of service centers and their
selection, some additional comments need to be made. Selection
of a service center will also involve considerations of
experience,

TABLE 6

CURRENT AWARENESS SEARCH SOFTWARE CAPABILITIES

                                Number of Information Centers

System                        Full      Associate   Non-
Feature                       Members   Members     Members   Total

Weighted profile terms          16        [?]         [?]       26

Truncation:
  Left                          13          8           6       27
  Right                         22          9          10       41
  Both simultaneously           13          8           6       27
  No truncation                  3          8           3       14

AND, OR, & NOT
logic operators                 22         14          12       48

Proximity logic operator         6          3           1       10

Nested logic (((())))           13         11           9       33

TABLE 7

CONTENT OF CURRENT AWARENESS OUTPUT

                                Number of Information Centers

Information                   Full      Associate   Non-
Printed                       Members   Members     Members   Total

Abstracts                       20        [?]          10       36

Keyword(s) and/or
index term(s)                   20         12           9       41

Term(s) causing hit             12          8           5       25

User name with
each citation                    9          7           3       19

User ID number with
each citation                   19          8         [?]       32

TABLE 8

MEDIA FOR CURRENT AWARENESS OUTPUT

Type of
Output                        Full      Associate   Non-
Available                     Members   Members     Members   Total

Cards; 1-part:
  3" x 5"                        0          0           1        1
  3-1/4" x 7-3/8"                2          2           0        4
  4" x 6"                        3          1           2        6
  4" x 9"                        0          0           1        1
  5" x 8"                        2          3           0        5

Cards; 2-part:
  3" x 5"                        1          0           0        1
  3-1/4" x 7-3/8"                9          0           0        9
  4" x 6"                        0          0           0        0
  4" x 9"                        0          0           1        1
  5" x 8"                        0          0           0        0

Paper Listing:
  2-[?]" x 7-[?]"                1          0           0        1
  7-[?]" x 11"                   0          1           0        1
  8-1/2" x 11"                  12          6           9       27
  11" x 14"                      5         11           6       22
  14" x 17"                      0          1           0        1

Multiple copies available       12         10           7       29

COM                              6          0           1        7

Tape                             8          2           1       11

Multilith masters                2          1           2        5

TABLE 9

SUMMARY OF INFORMATION CENTER SEARCH AND OUTPUT CHARACTERISTICS
ASIDIC "O" MEMBERS



proven reliability, flexibility of operation (e.g., the
ability to accommodate high priority searches on a "rush"
basis), as well as general attitude and "customer orientation"
(the "personal element" in O'Donohue's evaluation). The
experience of the information center is important, especially
its experience with particular data bases. Experience is related
to years of operation and to volume of profiles and/or searches
handled. These data are readily gleaned from the ASIDIC Survey.
Clearly, NASIC should deal only with centers that appear to be
stable and whose continuity appears to be assured. Another
element to be considered is the degree of interest that the
center exhibits toward the quality control and improvement of
its products. One form of evidence of this is the amount of
interaction and iteration a center will undertake before it
"stabilizes" a profile. Another form of evidence is the amount
and type of feedback and evaluation solicited from users, and
the degree to which the center attempts to improve its
performance on the basis of such feedback and evaluation. Some
gross information on solicitation of feedback is presented in
the ASIDIC Survey. The degree to which user citation is used as
a means of improving overall performance is also evidence of a
center's interest in the quality of its product. O'Donohue's
survey is a useful summary of experience with a small number of
centers. It is important to note that some centers satisfied the
evaluation criteria of this study very well while one or two
others were highly unsatisfactory. O'Donohue concluded that
"great care is required in the selection of commercial
information services. The spectrum of potential satisfaction is
wide and the user must analyze his needs and his supplier's
capabilities carefully to optimize results".

Ultimately a user will judge an information service on
cost-effectiveness grounds, relating the cost of using the
service to the quality of the product provided by that service.
The most useful cost-effectiveness measure to use in the
evaluation of information services is the cost per relevant
citation obtained from the service. If a user subscribes to an
SDI service, at an annual cost of $150, and is supplied with
seventy-five relevant citations in a particular year, the cost
per relevant citation is $2.00. When fully operational NASIC
must develop quality control and monitoring operations that will
permit the cost-effectiveness evaluation of data bases, and
services from them, in terms of this important measure. This
will require the development of procedures for obtaining regular
and precise feedback from users. Perhaps the most complete
evaluation of data bases and services yet conducted, although
restricted to drug-related information, is that reported by
Ashmole et al. (25). These investigators compared various
approaches to locating information on a particular drug in terms
of the yield of each source, the number of unique references
supplied by each source, the source that disclosed a particular
reference for the first time (novelty), and cost per relevant
citation.
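
The measure is a simple ratio and can be computed directly from user
feedback. The Python sketch below reproduces the $150 example from
the text; the feedback record (a list of relevant/not-relevant
judgments, 300 citations in all) and the function names are
assumptions made for illustration.

```python
def relevant_from_feedback(judgments):
    """Count the citations a user marked as relevant (True) in feedback."""
    return sum(1 for is_relevant in judgments if is_relevant)


def cost_per_relevant_citation(annual_cost, relevant_citations):
    """Annual subscription cost divided by relevant citations received."""
    if relevant_citations <= 0:
        raise ValueError("no relevant citations were reported")
    return annual_cost / relevant_citations


# Hypothetical year of feedback: 300 citations delivered, 75 judged relevant.
feedback = [True] * 75 + [False] * 225

relevant = relevant_from_feedback(feedback)
print(f"${cost_per_relevant_citation(150, relevant):.2f} per relevant citation")
```

The calculation depends entirely on the completeness of the feedback,
which is why regular and precise feedback procedures are needed
before the measure can be applied across data bases and services.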

In conclusion, it seems likely that the NASIC operation, when
fully implemented, will involve a number of service modes. Some
data bases NASIC may choose to acquire and search in-house,
on-line or off-line, while others will be accessed through
service centers. It is likely that a number of service centers
will be used for different data bases and that, to serve
different purposes, the same data base may be accessed both
through an on-line processing center and an off-line center.
This section of the report has attempted to present various
guidelines and criteria to guide NASIC in the choice of centers
with which to deal. Although machine-readable data bases have
been assumed throughout this discussion, it must be remembered
that some unique data bases exist in manual form in various
parts of the world and that both SDI and retrospective search
services are available from centers without mechanized
operation. An important example is the Scientific Documentation
Centre, Dunfermline, Scotland, which operates a unique service
in the field of spectra and spectral data. It is important that
NASIC seek out such centers and, where appropriate, integrate
these services into the overall NASIC operation.

F. SOME CONSIDERATIONS RELATING TO NASIC PRODUCTS AND SERVICES



In general, as discussed earlier, information services are
of two broad types:

1.  Current awareness services (for alerting purposes)
    such as SDI.

2.  Retrospective search services (on-demand).

Both types of service can be provided from a particular data
base and both may be offered by a particular service center.
Retrospective searches may themselves be divided into two
types:

a.  Quick-reference search to find a particular item of
    data or a "few good references" on a specific topic.

b.  Comprehensive search to find all references on a
    particular subject, or all references published on this
    subject in a particular period.

The first of these requirements can usually be satisfied by a
relatively short search but this usually requires the rapid
response that is most likely to be satisfied through a search
of printed tools or machine data bases available on-line. The
second type of search is more time-consuming but is usually
associated with a much less stringent response time requirement.
This type of search is more suitable for off-line processing in
large machine-readable data bases, although the strategy might
be tested initially on a portion of the data base available
on-line.

It seems likely that in its implementation schedule NASIC
should give first priority to the SDI aspect of its service.
There are a number of reasons for this. An SDI service is
generally more commercially viable than a retrospective search
service, it is likely to have a wider initial appeal, and it is
a service for which volume of demand is easier to assess. In
contrast, it tends to be quite difficult to estimate the likely
pattern of demand for retrospective searches from a particular
data base. Moreover, because an assured minimum annual volume
can be determined for SDI service from a particular data base,
it will be possible for NASIC to negotiate favorable rates from
a processing center. Likewise, the costs of licensing a data
base, purchasing software to search it, and offering service
as an in-house operation, are much more likely to be recovered
through a regular current awareness service than through a
retrospective search service. The latter is only likely to be
economically viable if there is a fairly heavy and constant
level of demand for service. In the case of an in-house batch
processing service, the volume of demand must be such that
batches of specified minimum size can be run on a regular basis,
say once a week.

In the case of SDI service, it seems desirable, for reasons
discussed earlier, that NASIC information services librarians be
trained in techniques of profile construction and be given
responsibility for development and updating of profiles, whether
a data base is operated in-house by NASIC or through a service
center. For certain data bases used for SDI purposes, NASIC may
also have on-line access. In this case, an initial user interest
profile can be developed by direct interaction with the data
base until it is refined to the point at which it can be
submitted for regular batch processing.

The use of an on-line system directly for SDI purposes
is also possible, a good example being NLM's SDILINE. On-line
SDI may play some part in the overall pattern of NASIC
operation, but the use of this type of service is likely to be
restricted to certain large academic libraries within the NASIC
community, these libraries having their own on-line access to
various data bases.

Group SDI tends to be considerably cheaper than completely
customized SDI to an individual. By "group SDI" we mean
service, based on the same profile, to a group of users having
common, homogeneous interests. Several service organizations
offer some form of group SDI. A good example is the
"macroprofiles" service of the United Kingdom Chemical
Information Service. Another is the ASCATOPICS service of the
Institute for Scientific Information. A service of this type is
also offered by IITRI and CAN/SDI. Group SDI works on the basis
of standard profiles in relatively broad subject areas. These
profiles are established by the service center and advertised
by that center. If a standard profile is a reasonable match with
the interests of a particular scientist he may subscribe to it
at a rate considerably less than the rate he would pay to have a
profile custom-made to his interests. Because NASIC is to
service such a large audience, it should be possible to identify
several groups of scientists with relatively similar interests.
Indeed, this should be a high priority task in the NASIC
program. Even if many of the interest groups thus identified do
not match one of the group SDI profiles already in existence, it
should be possible to negotiate a very favorable rate for a new
group profile with an appropriate service center.

Before leaving the subject of SDI, it is worth pointing
out that keeping up-to-date with newly published literature
is only one facet of a complete current awareness program.
Another important facet involves finding out what is going on
in current research (i.e., who is doing what in a particular
subject area). This involves the searching of indexes to
ongoing research projects, the most notable, of course, being
the files of the Science Information Exchange. NASIC must
certainly make use of this important source and other
specialized sources of information on ongoing research. Access
to information on ongoing research must be an integral element
in a complete information service.

In the area of retrospective searching a much greater
variety of approaches is possible. A rather gross summary of
some major possibilities is given in Table 12. It seems likely
that, when fully operational, NASIC will maintain certain data
bases in-house, for SDI, retrospective search and data
retrieval functions. Others will be accessed by on-line
terminals located at NASIC headquarters or at another NASIC
center. Yet others will be searched through the batch processing
facilities of a service center or a data base producer.

TABLE 12

POSSIBLE MODES OF ACCESS TO DATA BASES
FOR TYPES OF INSTITUTIONS AND VOLUME OF USE

NASIC                       DATA BASES, BY VOLUME OF USAGE
SERVICE
INSTITUTION   HIGH VOLUME           INTERMEDIATE VOLUME   OCCASIONAL USE

Large         Direct on-line        Access via NASIC HQ,  Access via NASIC,
academic      access to data        where data base is    where data base is
library       base                  accessible on-line    accessed off-line
                                                          through producer
                                                          or service center

Medium-       Access through        Access via NASIC HQ,  Access via NASIC,
sized         teletype or           where data base is    where data base is
library       telephone to NASIC    accessible on-line    accessed off-line
              HQ, where data base                         through producer
              accessible on-line                          or service center

Small         Access through        Access via NASIC HQ,  Access via NASIC HQ,
library       NASIC HQ, via mail    where data base is    where data base is
              or telephone          accessible on-line    accessed off-line
                                                          through producer
                                                          or service center

One of the major functions of the NASIC information services
librarian (ISL) will be that of deciding, for any particular
information need presented to him, which available data base
should be queried. He should also be able to quote to the
customer a cost for the search to be conducted. The ISL will
clearly need to have a wide knowledge of available bibliographic
and data resources, manual as well as mechanized. A very
important NASIC tool will be a printed guide to available
resources. Such a guide should be much more comprehensive than
any presently available, and it must be indexed by specific
subject areas. The guide should be capable of indicating to the
ISL which sources are most likely to yield information of a
particular type. For a request for information on nuclear
instrumentation it should be capable of leading him directly to
the International Nuclear Information System (INIS), the
Information Service on Nuclear Science and Technology (UKAEA)
and Nuclear Science Abstracts (USAEC) as most likely sources.
The detailed description of the data bases, and services
available from them, will help the ISL decide which source is
most likely to be profitable for the specific topic under
review. Likewise, the guide should be capable of leading to
PESTDOC for a search on rodenticides, to CAIN for a search
relating to forestry, and to the services of the Highway
Research Information Service (HRIS) for information on the
design of parking decks. This NASIC guide must be kept
up-to-date and might therefore be issued in loose-leaf form.
Each ISL would be encouraged to contribute to it by bringing to
the attention of the compilers newly discovered sources, not
previously indexed, for a particular subject area.
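
Although the guide is envisaged as a printed tool, its subject index
behaves like a simple lookup structure. The Python sketch below is an
illustration only: the class name ResourceGuide and its methods are
hypothetical, the subject headings are assumed for the example, and
the entries are limited to the sources named in the paragraph above.

```python
from collections import defaultdict


class ResourceGuide:
    """A subject-indexed guide mapping subject areas to likely sources."""

    def __init__(self):
        self._index = defaultdict(list)

    def add(self, subject, source):
        """Index a source under a subject area (e.g., a newly found source)."""
        key = subject.lower()
        if source not in self._index[key]:
            self._index[key].append(source)

    def sources_for(self, subject):
        """Return the sources indexed under a subject area, in order added."""
        return list(self._index.get(subject.lower(), []))


guide = ResourceGuide()
for source in ("INIS",
               "Information Service on Nuclear Science and Technology (UKAEA)",
               "Nuclear Science Abstracts (USAEC)"):
    guide.add("nuclear instrumentation", source)
guide.add("rodenticides", "PESTDOC")
guide.add("forestry", "CAIN")
guide.add("parking decks", "HRIS")

print(guide.sources_for("Nuclear instrumentation"))
```

The add method corresponds to the loose-leaf updating described
above: an ISL who discovers a source not previously indexed for a
subject area contributes it under that heading.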

Through the ISL a NASIC customer should have access to any
data base, manual or mechanized, that exists and is generally
available. In practice, NASIC must identify those data bases
likely to satisfy the greatest volume of demand and make these
data bases most accessible to the scientific community. A
high-usage data base would be made readily accessible through an
on-line terminal connection established at NASIC headquarters.
Some of the larger NASIC libraries may also have direct on-line
connection to one or more data bases expected to be of
major value to their own user group. For smaller libraries
and less-used data bases a common mode of service will be a
rapid-response on-line search requested by telephone call or
teletype message to NASIC headquarters. It is likely that the
NASIC headquarters will have facilities to access a number of
important data bases remotely. Additional data bases may be
maintained in-house for on-line or off-line access. For the
less-used data base, however, NASIC will request service from
a service center when the need for such service arises. It is
desirable, therefore, that the NASIC guide should identify a
preferred service center for each data base. Arrangements
should be made with this center for rapid turnaround service
if the need for fast response is critical for a particular
request.

Initially, at least, it is likely that most searches will
be conducted by NASIC information specialists or information
specialists associated with other centers. However, the NASIC
headquarters may have an interactive information facility that
is open to members of the scientific community. If a scientist
wishes to visit this facility, and use the on-line resources
directly, he may be allowed to do so. Similarly, some of the
larger academic libraries, having their own on-line facility,
may allow scientists access to data bases directly, and will
provide some training facilities to make this possible. Finally,
if within the NASIC community a large academic department is
identified as having a very high level of demand for a
particular data base, arrangements might be made to provide
terminal access within that department. If NASIC is able to
distribute a number of terminal installations throughout the
region, and each of these installations can be used to access a
number of different data bases, the combined volume of business
thus generated may be more than sufficient to make this level of
accessibility justified, both in terms of convenience and
economics.

Since NASIC will be dealing with data as well as with bibliographic
retrieval, an important element in the overall NASIC
program may be a referral service. The information services
librarian, with the assistance of staff at NASIC headquarters,
should be capable of referring a scientist to any likely source
of information, whether this is a formal data base, an information
analysis center, or an individual consultant. In this important
referral function, NASIC may work closely with the National
Referral Center for Science and Technology.

Finally, at a point in time when NASIC is fully operation- 
al and terminals are widely available in academic institutions 
throughout the area, NASIC should consider the possibility of 
initiating a program, using computer facilities at NASIC headquarters,
or a NASIC center, for providing on-line support to
the creation and exploitation of personal files. Through this
service an individual scientist could purchase access to programs
that would allow him to build files of data or references,
add to them, delete from them, search them on-line, obtain machine
listings, and so on. Personal files constitute a very important
source of information for most scientists. Such files
are not replaced by general information services, however
good these services are. Indeed, formal information services,
especially SDI operations, are used to feed personal
files. Unfortunately, the conventional personal file tends
to be a simple pigeonhole system, with very limited access
capabilities. On-line systems designed to aid the researcher
in the efficient exploitation of his own files (e.g., RIQS
at Northwestern University and AUTONOTE at the University of
Michigan) have proven very popular in the academic community.
Such a system might be an important element in overall system
planning. Ideally, through a terminal located in his own office
or department, a scientist should be capable of accessing his
own files, or at least indexes to them, as well as accessing
any of several outside data bases made available to him through 
NASIC. From the same terminal, then, he is given convenient, 
rapid access to a very wide range of bibliographic and data 
resources. This type of facility may substantially alter his 
present information gathering habits, and he may well judge 
that the benefits available from such a service more than off- 
set the costs involved in using it. With a very large operation 
these costs may not be excessive. 

G. SOFTWARE ASPECTS OF INFORMATION SERVICE CENTER OPERATION



The operations and procedures necessary to provide the 
information services and products desired of NASIC will of 
course be largely computer based. Thus, it will be necessary
to discuss the various types of computer programs or
software needed by NASIC to perform its information services
function, and the limitations of and problems associated with
such software. Criteria for selecting and evaluating such
software will be presented in order to assist NASIC in choosing
among the possible alternative sources or methods for providing
the desired information services and products.

NASIC may decide to purchase, lease, or license a certain
data base and the appropriate software for its own in-house
use (or for use on a nearby computer on which NASIC
has purchased time), or NASIC may simply purchase certain information
services from a remotely-located information center.
A different collection of software considerations applies depending
on whether NASIC uses a certain data base in-house or purchases
services from a remote service center. But certain software
considerations apply to all information retrieval software no matter
which group actually operates it. In the following discussion
the evaluation and selection criteria that apply to all information
retrieval software will be presented. Then, the problems
that arise and the additional evaluation and selection criteria
that must be considered will be given for the situation when it
becomes necessary or desirable to use software on a data base,
computer, or computer system different from the one for
which it was written. The latter will be the situation
if NASIC decides to obtain a certain data base and appropriate
software for in-house operation rather than simply
purchasing information services from a remotely-located
center.

For the operation of an information retrieval system 
the following four groups of computer routines are required: 
(1) Input/Output routines, (2) Search routines, (3) Data 
Management routines, and (4) Data Base Transformation routines.

Input routines are those computer routines which oper- 
ate on the user's request for information. They can correct 
the form of the user's request by checking for errors in formatting,
punctuation, or spelling; or they can elucidate or
expand the user's request by requiring more specificity or
precision in his request statement, or by imposing on him or
informing him of areas closely related to his area of interest.
Examples of the latter class of input routines are
thesaurus lookup routines. The inputting of the user's request
for information is frequently done on-line, though it
can be done off-line. Output routines are those which put
the user's output onto the medium (cards, paper, or magnetic
tape) and into the format and order that the user had indicated,
and make provision for routing this output to him. Software
for input/output operations will be discussed further in
one of the following chapters.
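
The two kinds of input routine described above (error checking and thesaurus expansion) can be illustrated with a minimal Python sketch. It is not taken from any system discussed in this report: the thesaurus entries, the comma-separated request syntax, and the function names are all assumptions made for illustration.

```python
# Hypothetical thesaurus: each term maps to closely related terms.
THESAURUS = {
    "laser": ["maser", "optical pumping"],
    "polymer": ["copolymer", "plastic"],
}


def check_request(raw):
    """Normalize a comma-separated request and report simple format errors."""
    errors = []
    if raw.count("(") != raw.count(")"):
        errors.append("unbalanced parentheses")
    cleaned = raw.replace("(", " ").replace(")", " ")
    terms = [t.strip().lower() for t in cleaned.split(",")]
    terms = [t for t in terms if t]
    if not terms:
        errors.append("no search terms given")
    return terms, errors


def expand_request(terms, thesaurus=THESAURUS):
    """Return the terms plus their related thesaurus terms, without duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term] + thesaurus.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded
```

In this sketch the checking routine only reports errors; whether a request with errors is rejected or returned to the user for correction would be a design decision for the particular system.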

Search routines are those parts of the computer programming
system that identify (by some sort of matching
operation) those pieces of, or citations from, the desired
data base(s) that correspond most closely to the user's expressed
information wants. Following the identification of
the information desired by the user, certain portions of this
information are extracted or reproduced and sent to the output
routines for transmission to the ultimate user. Searches
in information retrieval systems are of two main types:
current awareness searches and retrospective searches. Software
for search operations will be discussed further in a
following chapter.
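
As a rough illustration of the matching operation, the Python sketch below tests each record's index terms against a simple Boolean request. The record layout (an "id" and a list of "terms") and the all_of/any_of request form are assumptions for this example, not features of any particular system named in this report.

```python
def matches(record_terms, all_of=(), any_of=()):
    """True if the record has every all_of term and, if any_of is given, at least one of them."""
    terms = {t.lower() for t in record_terms}
    if not all(t.lower() in terms for t in all_of):
        return False
    if any_of and not any(t.lower() in terms for t in any_of):
        return False
    return True


def search(records, all_of=(), any_of=()):
    """Return the ids of the records that satisfy the request."""
    return [r["id"] for r in records if matches(r["terms"], all_of, any_of)]
```

Under these assumptions the same routine could serve both search types: a retrospective search would pass the whole file as `records`, while a current awareness search would pass only the latest update and a stored user profile as the request.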

Data Management routines are those routines which handle 
the complex billing and accounting procedures required for 
each user of the information retrieval system and which also
perform the statistical calculations desired by the users,
the information center managers, and the suppliers of the
data bases. Here will also be discussed any devices used 
to achieve maximum throughput speed or the servicing of the 
largest number of user requests in the shortest possible 
time. Software for data management operations will be dis- 
cussed further in one of the following chapters. 

Data Base Transformation routines are computer programs 
which are essentially outside of the information retrieval operation.
These routines operate on data bases to transform
them into a more desirable form or format or even order. Also
included in this group of routines are any programs which
extract certain information from a data base and form a new
file with it (possibly to assist in the search operations).
Data Base Transformation routines are not really necessary
for the operation of a small information service center but
become virtually essential when more than one data base is
searched, since it is obviously desirable to have only one
set of search programs. Software for data base transformations
will be discussed in the next chapter.
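
One common example of a file extracted from a data base to assist searching is an inverted file, which lists for each index term the records that carry it. The report does not name this technique; the Python sketch below is offered only as an illustration, and the record layout is hypothetical.

```python
from collections import defaultdict


def build_inverted_file(records):
    """Map each index term (lowercased) to the ids of the records containing it."""
    inverted = defaultdict(list)
    for rec in records:
        for term in sorted({t.lower() for t in rec["terms"]}):
            inverted[term].append(rec["id"])
    return dict(inverted)
```

A search routine can then look up a term in the new file directly instead of scanning every record of the original data base.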

The basic objectives to be considered in evaluating and 
selecting software for an information services and products 
system are (1) the minimization of total cost to the user,
(2) the minimization of total response time or the time from
the initial submission of the user's request to his reception
of the desired information, and (3) the provision to the user
of a reasonably easy system to learn about and to operate. (31)
It should be noted that these basic objectives are often contradictory
or in conflict: minimizing cost may make the system
harder for the user to operate, etc. It is usually in
attempting to achieve all these objectives that the problems 
associated with information retrieval software arise. 

The problems associated with the software required to pro- 
duce information services and products for outside customers 
appear to be divisible into two classes: problems associated 
with virtually any complex computer programming system that 
uses input from many different sources (magnetic tape, disc, 
teletypes) and that must scan a large body of data; and prob- 
lems associated with producing software that is transferable. 
Problems that are associated with the former would be assuring
the privacy and inviolability of each user's data and reorganizing
or grouping the data and allocating computer memory
and storage devices in an optimal way to the various data
sources so as to secure the maximum throughput speed. These
problems will be considered further in the chapter on data
management software. As to the latter, when it is desired
to transfer a group of computer routines to a data base, computer,
or computer system different from that for which the
software was originally written (as would be the situation
for data bases that NASIC intended to use in-house), it is
necessary to have software that is "transferable". For a certain
collection of software routines to be transferable, they
must possess characteristics such as being written in a common
high-level language and modularity (or being composed of separable
pieces). The characteristics required for software to
be transferable will be described in detail in the last section 
of this chapter. 

The criteria for selecting and evaluating software to be 
used by NASIC in providing the information services and products
desired by their customers will now be presented. These criteria
appear to be divisible into two main classes: general criteria,
or criteria that apply to any piece of software designed
for any part of an information retrieval system, and "transferability"
criteria that apply when it is desired to transfer
some piece of software to a data base, computer, or computer
system different from that for which the software was originally
designed. The second group of criteria would only be
applicable when NASIC was considering using some data base
in-house (or on a nearby computer on which NASIC had purchased
time), whereas the first group of criteria would apply to any
software employed by NASIC either directly or indirectly through
an intermediary to provide information services for its customers.
(These latter criteria include those used to evaluate software
employed by an information center from which NASIC purchases
some information services or products.) The general criteria
that apply to any piece of software designed for any part of an
information service system are listed in Table 13 and are described
more fully in the paragraphs below.

Documentation

Adequate documentation is obviously necessary for the soft- 
ware of any information retrieval system that NASIC might attempt 
to purchase or use in any manner. As a part of this documenta- 
tion, all options available for each piece of software must be 
adequately described and illustrated. The documentation must 
include a complete description of all error conditions and mess- 
ages, and a precise description of the steps necessary to correct 
each error condition. The latter is particularly important since
NASIC personnel will have to identify and correct any error arising
in the processing of a user's request. This documentation
should also include extensive examples of each of the important
types of request to assist NASIC personnel in the use of this
system. (Documentation of an existing system is usually but
not always available.)

Reliability (Operability)

NASIC obviously does not want to use a system that is not
very reliable in spite of any other considerations. But problems
arise in determining just how reliable a system really
is. It is certainly not adequate to simply accept the assurances
of the software developer. Present or former users of
this system should be asked their opinion of its reliability.
Possibly NASIC should run an acceptance test on certain information
systems to determine for itself their reliability. (It
is assumed that any system under serious consideration by NASIC
must be operable. NASIC would not at this stage want to consider
any software that was still in the developmental stage.
NASIC would certainly not want to have to debug or rewrite someone
else's program.)

TABLE 13

EVALUATION AND SELECTION CRITERIA FOR INFORMATION
RETRIEVAL SOFTWARE (GENERAL CRITERIA)

(Criteria ordered roughly in decreasing order of importance)

Documentation

Reliability (Operability)

Throughput time

Availability of updating capabilities

Cost

Availability of software maintenance from original supplier
(Vendor guarantees of software reliability or future maintenance)

Degradation of response time with number of users (For time-
sharing systems)

Operating time

Maximum number of users (For time-sharing systems)

Number of people wanting to use this particular software

Ease of usage

Uniqueness of software

Throughput Time 

It is obviously desirable for NASIC to obtain that inform- 
ation retrieval system which has the minimum average throughput 
time (or minimum time from the submittal of the user's request
to his reception of the output). But comparisons between the
average throughput times for different systems are difficult.
The throughput times for each of the four basic operations associated
with information retrieval - current awareness services,
batch retrospective searches, interactive retrospective searches,
and document delivery - are only comparable within these operations.
Throughput times for the same data base for each of these
four services will differ widely for the same system due to the
widely differing nature of these four services. Throughput times,
of course, also differ according to the data base used. Thus,
when comparing throughput times between systems, it is necessary
to compare times for the same data base as well as the same basic
operation. This implies that it
will be hard to obtain a substantial body of data with which to
compare system throughput times. Also, occasionally system A
may have faster throughput times for current awareness searches
but slower throughput times for batch retrospective searches
than system B for the same data base, in which case the comparison
between the throughput times for systems A and B would include
a complicated tradeoff involving the comparative NASIC
usage of current awareness services and batch retrospective
searches. It is also possible that some installations may provide
different throughput times for the same service on the same
data base depending on the price charged the user. In these
cases NASIC would have to analyze the trade-offs, though the
final choice as to the option taken would probably be left to
the ultimate customer. It should be pointed out that the throughput
times compared should be those for the entire operation, not
for various parts of the operation, since these would be largely
meaningless.
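
The rule that throughput times are comparable only for the same data base and the same basic operation can be made concrete with a short Python sketch. The measurement tuples, system names, and figures in the usage example are invented for illustration; the report gives no such data.

```python
from collections import defaultdict
from statistics import mean


def compare_throughput(measurements):
    """Group (system, data_base, operation, hours) tuples so that averages
    are compared only within the same data base and operation."""
    grouped = defaultdict(lambda: defaultdict(list))
    for system, data_base, operation, hours in measurements:
        grouped[(data_base, operation)][system].append(hours)
    return {
        key: {system: mean(values) for system, values in by_system.items()}
        for key, by_system in grouped.items()
    }
```

For example, with hypothetical measurements in which system A is faster for current awareness and system B is faster for batch retrospective searches on the same data base, the result has one entry per (data base, operation) pair and no single overall ranking, which is the tradeoff described above.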

Availability of Updating Capabilities 

It is certainly necessary to have adequate software facil- 
ities available for updating or correcting any data base to 
be used by NASIC since most of the important data bases are 
being continually updated and corrected. But, because of 
this latter fact, it seems reasonable to assume that any group 
that has a data base will have some sort of facilities to update 
it. Thus, this consideration should not arise too frequently. 
As to the possibility of the software itself having the capacity 
of being updated (by adding new functions, operations, or capabilities),
this is probably not too important since it will often be
possible in the future to find and use better, more flexible software
on the common data bases. Also, it would seem that improving
the capacity of the software would be something to be done in
the distant future rather than now.

Cost 

Costs will have to be determined for each software sys- 
tem and data base under consideration for each of the first 
three of the four basic uses or services: (1) Current awareness
services; (2) Batch retrospective searches; and (3) Interactive
retrospective searches. For each system and data
base, it will be necessary to estimate the costs of doing a 
number of typical searches or operations for each of these 
three basic types of uses. It will be necessary to survey a 
number of suppliers of data base services to determine what 
sort of costs are typical and reasonable. However, the entire 
question of cost is largely inseparable from the questions of 
which data bases are being used and which capabilities are 
provided by the software. Thus, cost is less a software prob- 
lem than an entire information services system problem, though 
there may be occasional instances in which basically equivalent 
software facilities operating on the same data base differ 
somewhat in cost, in which case cost will be the most important
consideration.

Availability Of Software Maintenance From Original Supplier
(Vendor Guarantees Of Software Reliability Or Future Maintenance)

To insure the reliability of software used by NASIC, or 
to correct any errors that might be found in such software in 
the future, it would be desirable to obtain some sort of assurances
from the original supplier of any software used that it
will be adequately maintained for the foreseeable future. At
this stage NASIC certainly does not want to take on the task
of maintaining software, so any maintenance that must be done
in the future should be guaranteed by the original supplier.

Degradation Of Response Time With Number Of Users (Response
Time)

In interactive processing or processing under time-shar- 
ing, response time, the time between the submission of the 
user's request and the computer acting upon the request, is
by far the most important and most basic component of both
the total throughput time and the cost. Thus, when estimating
either the total throughput time or the cost for a system
employing time-sharing, the response time and its degradation
with number of system users should certainly be considered
first and emphasized most heavily in the analysis. However,
response time will certainly differ widely for each of the
two basic time-sharing operations - interactive current awareness
service and interactive retrospective search - so that the
comparison of response times between systems will certainly
have to be done separately for each of these two services.
Also, response time for each of these two services will differ 
widely according to the size of the data bases searched and 
the number of keys searched on. 

Thus, it will be difficult to obtain a substantial amount 
of comparable data with which to compare the response times 
of different information centers. Also, if the response time
for interactive current awareness services for system A is
better than that for system B but the response time for interactive
retrospective searches is worse, complex trade-offs
will be involved in this decision.

Operating Time 

This criterion does not refer to time-sharing or inter- 
active processing, but refers simply to the amount of com- 
puter time required of a certain batch-processing operation. 
It is an important consideration not in itself but because 
it has an important effect on both the total throughput time
and the cost for both current awareness services and batch
retrospective searches. Thus, operating time will usually
not be considered by itself, but will be considered in its
effects on the throughput time and the cost for each opera- 
tion. 

Maximum Number Of Users (For Time-Sharing Systems)

This would seem to be of importance only in the distant
future when NASIC might wish to service numerous requests for
information from the same data base at the same time.

Number Of People Wanting To Use This Particular Software

This would have to be determined by taking surveys of 
possible users, which is not the function of any software evaluation
and selection criteria.

Ease Of Usage

Ease of usage is one of the less important criteria 
since NASIC personnel will often be the principal users
of any data base and software system acquired by NASIC.
Ease of usage will certainly have to give way in situations
where a popular or useful data base can only be reached or
can best be reached (best in the sense of providing the most
relevant information at the lowest cost) through a hard-to-use
software system. (The Ohio State University system might
be rather hard to use since it requires the estimation of
probabilities of relevance to the user.) (33)

Uniqueness Of Software

This consideration should not arise too frequently in 
practice due to the intense competition in the software industry.
It is rather unlikely that only one piece of software
will be available to do any important particular job or
function.

Questions relating to the fairness of the franchise arrangements
or the number of present users of a certain system and who
uses which parts of the system for how long or variations in the 
software usage cost with the amount of usage (the number of in- 
put requests) appear to be basically data base questions or to 
be intertwined with or intimately related to data base questions. 
A franchise arrangement would probably include the data base for 
which the software was written. Even if the franchise arrange- 
ment did not include the data base, the software would probably 
be specific to or intended for use on a certain data base in a 
certain format, and thus intimately involved with that data base. 
The number of present users, who they were, which parts of the 
system they used, and how long they used the system is obviously 
very highly dependent on the data bases employed and their popularity
and usefulness. The usage cost of the software is obviously
highly dependent on the size and format of the data base
searched. 

The criteria that apply to any software that NASIC wishes 
to transfer to a different computer, a different computer system,
or a different data base are now presented in Table 14
and described more fully in the paragraphs below. It should
be especially noted that the criteria of Table 14 are not a
replacement for the criteria presented in Table 13 but are an 
addition to it for those situations and only those situations 
in which NASIC wishes to transfer software to another computer, 
another computer system, or another data base. 

Dependence On The Particular Computer (Operability On A Widely-
Used Computer, Machine Limitations)

In discussing the transferability of a software system, it
is necessary to consider the degree of dependence of the software
system on the peculiarities and limitations of the particular
computer that it had been operating on - the characteristics
of the particular central processing unit and the presence or absence
of certain peripheral equipment, such as card or paper tape
readers, card or paper tape punches, magnetic tape readers, disc
storage units, printers, and their capacities, requirements, and
other characteristics. To ensure the transferability of a software
system the particular computer that it had originally been
run on should be a commonly available computer with no unusual
peripherals (for cards, tapes or disc storage) or unusual peripheral
combinations or sizes. It is most important to have a
sufficiently large amount of the correct type of storage. (The
University of Pittsburgh system seems to be machine dependent;
it will only operate on a PDP-10 with similar peripherals.) (32)

TABLE 14

EVALUATION AND SELECTION CRITERIA FOR TRANSFERABLE
INFORMATION RETRIEVAL SOFTWARE

(Criteria ordered roughly in decreasing order of importance)

Dependence on the particular computer (Operability on a widely-
used computer, machine limitations)

Dependence on the programming language (Use of a universal or
popular language)

Dependence on the particular operating system (Operating system
availability)

Operability on different data bases

Modularity

Dependence On The Programming Language (Use Of A Universal
Or Popular Language)

To obtain transferable software it is also necessary 
to consider the degree of dependence of the software system 
on the peculiarities and limitations of the particular lang- 
uage that the software is written in. To ensure transfer- 
ability the language that the software was written in should 
be a common widely-understood high-level language like COBOL
or PL/1.

Dependence On The Particular Operating System (Operating 
System Availability) 

When discussing the transferability of a software sys- 
tem, it is necessary to consider the degree of dependence 
of the software system on the peculiarities and limitations 
of the particular operating system and compiling system that 
the software had originally been run under or compiled on. 
To ensure the transferability of a particular software sys- 
tem, the operating system that the programs have been run 
under and compiled on should be a common system without any 
peculiar adaptations or features. It is of course rather 
hard to define "common" for an operating system (or a computer
or programming language) but it should be fairly obvious
when an operating system (or a computer or programming
language) is definitely peculiar or distinctly rare.

Operability On Different Data Bases

This is an important consideration since if some software
is to be transferred it is certainly desirable to have
the transferred software do as much for NASIC as possible,
for it to operate on as many of the new data bases desired
by NASIC as possible. (The phrase "new data bases desired
by NASIC" could of course refer to data bases already used
by NASIC but with whose software operating characteristics
NASIC is unhappy, as well as data bases that NASIC wished to
add to its collection.) However, the direct applicability
of software to data bases that were obtained from a source 
different from the software is improbable. Rewriting or 
modification of the program or conversion of the data base 
might be necessary, neither of which tasks NASIC might want 
to attempt itself.

Modularity

Modularity in construction refers to the division of a
computer program into a number of small logically separate
pieces. It appears that this is not too important a consideration
for NASIC since it is doubtful that NASIC would attempt
to rewrite or modify or correct any software it obtained
during the next few years, which is where modularity is
most important.

H. SOFTWARE CONSIDERATIONS: DATA BASE FORMATS



The basic reason for considering the conversion of a 
data base into another format, or even order, is that any 
information services center will almost always be operating 
on more than one data base. Since data bases originate from 
a large number of different sources and since no common form 
or format has become accepted by most data base producers, 
the data bases obtained by an information services center 
are liable to be in different forms and formats. But, since
it would be a tremendous duplication of programming effort to
write different search programs to search data bases in different
formats, or even to write different search routines for
inclusion in the same computer program to handle different
formats, it is usually desirable for an information services
center to convert all of its data bases into the same form
and format. Because of the desirability of this conversion,
it is necessary to consider the different formats in which
data bases are written. It is also necessary to consider the
problems raised in attempting to search each format, so that
an information services center can choose the data base format
which appears most cost-effective for searching.

A number of different standard record formats are used for 
various data bases. One of them is the PANDEX record format 
used for all data bases operated by the Mechanized Information 
Center (MIC) at Ohio State University in Columbus, Ohio (33).
Figure 4 shows the layout of a PANDEX record. It should be
noted that the PANDEX record format makes no provision for
a record directory, a group of words located at the beginning
of a record indicating the locations of the start of each field
of information in the record. It is fairly reasonable not to
have a directory as long as each field of information is the
same length from record to record. However, when fields are
of varying lengths from record to record, as would probably
be desirable if the information in the data base varied considerably
in length from record to record, such a directory is
very desirable. Otherwise it is necessary either to fill up
the short fields in a record with filler to make them conform
to the standard length for that field, or to search through 
each field of a record to find information located in the last 
fields. The former option is undesirable since it greatly in- 
creases the physical size of the data base. The latter option 
is of course highly undesirable since its use would increase 
the time required for searching tremendously. It would also 
make modification or correction of a record very hard and time 
consuming. Thus, in many cases directories in each record of 
a data base are very desirable and thus commonly used. 
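
The two directory-less options described above can be compared in a small Python sketch. The field widths, the "|" separator, and the sample record are assumptions chosen for illustration and do not describe the actual PANDEX layout.

```python
FIELD_WIDTHS = [10, 60, 30]  # hypothetical standard length for each field
DELIM = "|"                  # hypothetical separator for variable-length fields


def pack_padded(fields, widths=FIELD_WIDTHS):
    """Option 1: pad every field with filler to its standard length.
    Assumes no field is longer than its standard length."""
    return "".join(f.ljust(w) for f, w in zip(fields, widths))


def read_padded(record, index, widths=FIELD_WIDTHS):
    """A padded field is found by simple arithmetic on the fixed widths."""
    start = sum(widths[:index])
    return record[start:start + widths[index]].rstrip()


def pack_delimited(fields):
    """Option 2: store fields at their natural length, one after another."""
    return DELIM.join(fields)


def read_delimited(record, index):
    """Without a directory, every earlier field must be scanned past."""
    pos = 0
    for _ in range(index):
        pos = record.index(DELIM, pos) + 1
    end = record.find(DELIM, pos)
    return record[pos:] if end == -1 else record[pos:end]
```

For a record with the fields "A123", "Short title", and "Smith", the padded form occupies 100 characters against 22 for the delimited form, while reading the last delimited field requires stepping through both earlier fields. These are the size and search-time penalties that a record directory is meant to avoid.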

The MARC II data base, containing bibliographic citations
prepared by the Library of Congress, is an example of a data
base whose records contain a directory field. (34) The MARC II
record format is shown in Figure 5. The record directory is
composed of a series of fixed-length entries, each containing
the identification tag, the length, and the starting character
position in the record of one of the information (variable)
fields.

FIGURE 4. LAYOUT OF THE PANDEX (PX) RECORD

FIGURE 5. FORMAT OF THE MARC II RECORD
(A Record Containing a Record Directory)
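
The directory scheme described above can be illustrated with a
minimal sketch. The layout below is hypothetical and is not the
actual MARC II format: it assumes a two-digit entry count
followed by one ten-character directory entry per field (a
three-character tag, a three-digit length, and a four-digit
starting position). The function names are likewise invented for
illustration.

```python
# Hypothetical layout (NOT the real MARC II widths): a 2-digit entry
# count, then one 10-character directory entry per field
# (3-char tag, 3-digit length, 4-digit start), then the field data.
ENTRY_WIDTH = 10


def build_record(fields):
    """Pack {tag: text} into one string with a leading directory."""
    directory, data, start = [], [], 0
    for tag, text in fields.items():
        directory.append(f"{tag:<3}{len(text):03d}{start:04d}")
        data.append(text)
        start += len(text)
    return f"{len(fields):02d}" + "".join(directory) + "".join(data)


def read_field(record, wanted_tag):
    """Fetch one field by consulting the directory, without scanning the data."""
    count = int(record[:2])
    base = 2 + count * ENTRY_WIDTH  # offset at which the field data begins
    for i in range(count):
        entry = record[2 + i * ENTRY_WIDTH: 2 + (i + 1) * ENTRY_WIDTH]
        tag, length, start = entry[:3].strip(), int(entry[3:6]), int(entry[6:10])
        if tag == wanted_tag:
            return record[base + start: base + start + length]
    return None


record = build_record({"AUT": "Ware, W.", "TTL": "Computer trends", "KEY": "cost"})
print(read_field(record, "KEY"))
```

Because every directory entry has the same width, the last field
of a record is reached as cheaply as the first, and no filler is
needed in short fields.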

Converting all the data bases in a center's collection to a
common form or format, to avoid duplicating the search routines,
is usually a fairly expensive and time-consuming effort. This is
true not only because most of the data bases to be converted are
fairly large but also because errors are prevalent in the data
bases themselves and in the documentation describing them. Thus,
software to perform data base conversions will usually have to
include extensive error-checking features. Even so, errors in
the documentation for a data base, or inconsistencies in its
format and data representation, will probably necessitate
rewriting of the conversion routines and a number of false
starts. The University of Georgia Computer Center, whose
experience is probably typical, reports that for most of the
data bases it attempted to convert to its common format (the
Standard File Format (SFF) of the Chemical Abstracts Service)
the conversions had to be done at least twice, due principally
to inadequate, inaccurate, or absent documentation or to
inconsistencies in the format and data representation of the
data bases converted. (35) Such inconsistencies are usually not
detected until during or after the conversion, necessitating
some modification of the conversion routines and then a second
conversion attempt. Separate software will probably have to be
written for each data base to be converted and checked for
errors, so that writing the software for conversion of a large
number of data bases is liable to be very time-consuming and
expensive. Taken together, the effort and expense of writing
the software and the difficulties of dealing with inadequate or
inaccurate documentation and with inconsistencies in format and
data representation imply that the conversion of even a small
number of data bases to a common format will be very expensive
and time-consuming.

Another reason for wanting to convert data bases to another
form or format arises from problems caused by the hierarchy of
concepts according to which a data base is logically organized.
Logically (though not in actual practice) a data base is
organized by a hierarchy in which broader, more inclusive index
terms represent and include groups of more specific index terms.
An example of such a hierarchy is the following, taken from the
ERIC (Educational Resources Information Center) Thesaurus:
"guidance personnel" is the broader or more inclusive index term
which covers "adjustment counselors", "elementary school
counselors", and "special counselors" as narrower index terms.
(35) If a data base were organized or indexed according to this
hierarchy, "guidance personnel" would be the principal entry
point or principal heading in a search, and "adjustment
counselors", "elementary school counselors", etc. would be the
secondary entry points or headings under this principal entry
point. A search on a data base indexed in this manner would
proceed by first locating "guidance personnel", the principal
entry point, and then locating, say, "adjustment counselors"
under this principal heading. But since moving down the
hierarchy in this way would obviously take more computer time
than locating the area of interest directly (especially if the
hierarchy has more than two levels, unlike this simple example),
in actual practice data bases are never ordered according to a
hierarchy. Instead, it is usually desirable to have a data base
indexed by the lowest-level terms in the hierarchy, so that, in
the above example, "adjustment counselors", "elementary school
counselors", etc. would be the principal entry points. When the
form, format, or even order of a data base is changed so that
the terms at the lowest level of the logical hierarchy are the
principal entry points, the data base is said to be inverted.
The use of an inverted data base substantially speeds up the
searching process, since the specific topics of interest can be
located directly. (It is, of course, assumed that the user has
some very specific area in mind and that this is clearly
indicated.)
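
The difference between a serial search and a search on an
inverted data base can be sketched as follows. This is only an
illustration under simple assumptions: records are held in
memory as a mapping from a record number to its index terms, and
the record numbers and the function names are hypothetical.

```python
from collections import defaultdict


def invert(records):
    """Build an inverted file: index term -> sorted list of record numbers."""
    index = defaultdict(list)
    for rec_id, terms in records.items():
        for term in terms:
            index[term].append(rec_id)
    return {term: sorted(ids) for term, ids in index.items()}


def serial_search(records, term):
    """Examine every record in turn (no inversion)."""
    return sorted(rec_id for rec_id, terms in records.items() if term in terms)


# Hypothetical records indexed with the narrower ERIC terms quoted above.
records = {
    101: ["adjustment counselors"],
    102: ["elementary school counselors"],
    103: ["adjustment counselors", "special counselors"],
}
inverted = invert(records)

# The inverted file answers the query with one look-up; the serial
# search must read all records to produce the same answer.
print(inverted["adjustment counselors"])
print(serial_search(records, "adjustment counselors"))
```

The inversion itself requires a full pass over the data base and
a sort, which is the one-time cost that later searches avoid.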

Using an inverted data base to speed the searching process
implies that the original data base and any additions to it are
both put into inverted form before any searches are performed.
But inverting the original data base and the additions to it
requires that some sort of data base conversion be performed.
Conversions to invert a data base are subject to all the
difficulties mentioned before in writing elaborate conversion
software and in performing the actual conversions; in addition,
an inversion can involve a sort into some new order. Thus,
inverting a data base will indeed make all subsequent searches
on it less costly and less time-consuming, but inverting the
data base (and additions to it) will usually be an expensive
and time-consuming process.
I. SOFTWARE CONSIDERATIONS: SEARCH OPERATIONS



The search operations included in an information products
and services system are of two major types: current awareness
searches and retrospective searches. Current awareness searches
consist of matching the information in user profiles against
the data in the latest issue of a data base and retrieving all
the relevant information. Retrospective operations consist of a
search through past issues of a data base for citations or
material relevant to a given user request. Because of the
differing sizes of the data bases involved and the differing
states of the user's profile or request, current awareness
searches and retrospective searches often differ quite widely
in the manner in which they are performed and the problems
encountered in conducting them.

Current awareness operations consist of searching the
latest issue of some data base for all citations and material
relevant to the specific areas of interest indicated on an
already-available user profile. Since only the latest issue
of a data base is searched in current awareness operations,
and since updates for most common data bases are issued fairly
frequently (every two weeks or every month), the size of the
data base searched is usually fairly small. The profiles of
the users are also already available and have usually been
used in previous current awareness searches, so that no
modification of them is contemplated or desired. The user
profiles employed in current awareness searches are considered
to represent precisely what the user wants. Because no
modification of the profiles is permitted and the data bases
are small, batch processing rather than interactive usage is
usually quite acceptable for current awareness services.
Failures to obtain the citations or information the user
desires are uncommon, because most profiles have been used
before in current awareness searches. Since current awareness
searches are done when the data base is updated and since most
data bases are updated fairly often, the user will not usually
have to wait a long time for his results. The user's need for
the information provided by a current awareness service is also
usually less urgent than in other situations, since current
awareness searches are used by researchers principally to keep
themselves up to date.

It is, however, virtually essential for the information
service center to provide a fairly rapid response with current
awareness services, if only because the next current awareness
search will be due shortly. An information center must put
current awareness operations on a production basis if they are
to be successful. Provision must also be made for the ultimate
user to change his profile on the basis of previous current
awareness search results given to him. Facilities for
interaction with the user must be available and of adequate
size to respond quickly to his new needs.

As to typical costs involved, the University of Georgia
charges $5 to $10 per profile per current awareness search for
a single data base. (32) Ohio State University charges $300
per profile per year for current awareness services, which
appears to be somewhat higher than the prices offered by the
University of Georgia, since current awareness searches at
the University of Georgia are often performed weekly or
semimonthly.
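
The comparison depends on how often searches are run. The
following sketch annualizes the quoted University of Georgia
prices; the figures of 24 and 52 searches per year are
assumptions corresponding to semimonthly and weekly runs, and it
is further assumed (the source does not say) that the Ohio State
fee covers a single data base.

```python
GEORGIA_PER_SEARCH = (5, 10)  # dollars per profile per search, one data base
OHIO_STATE_PER_YEAR = 300     # dollars per profile per year


def annual_cost(per_search, searches_per_year):
    """Yearly charge for one profile at a fixed per-search price."""
    return per_search * searches_per_year


# Assumed run frequencies: 24 per year (semimonthly), 52 per year (weekly).
for label, runs in (("semimonthly", 24), ("weekly", 52)):
    low, high = (annual_cost(p, runs) for p in GEORGIA_PER_SEARCH)
    print(f"Georgia, {label}: ${low} to ${high} per profile per year")
print(f"Ohio State: ${OHIO_STATE_PER_YEAR} per profile per year")
```

On these assumptions the Georgia charge comes to $120 to $240 a
year at semimonthly frequency and $260 to $520 a year at weekly
frequency.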

The current awareness operations of the University of
Georgia will now be described. (32) The University of Georgia
Information Center staff construct and code the search profiles
after interviewing the customer. The search profiles are then
entered on an interactive system using CRT terminals. Syntax
editing of the profiles is done immediately, with diagnostics
returned to the operator of the terminal. Profile updates can
be performed as requested by the customer. The output of a
current awareness search can be on either standard-size paper
or cards and will contain the title, the primary bibliographic
citation, and any included index terms or codes. Current
awareness searches are performed weekly, biweekly, semimonthly,
monthly, or quarterly, depending on the frequency with which
the particular data base is updated.

Retrospective operations consist of a search through a part of
the past issues of some data base, the number of years of issues
to be searched being specified by the user, for all citations
and material relevant to the specific areas of interest
indicated by the user. Unlike the situation for current
awareness searches, an interactive on-line or semi-interactive
system seems to offer substantial advantages to the user. When a
user does a retrospective search, he does not have a previously
prepared and tested profile or request (unlike the situation
that usually exists for continuing current awareness searches).
The user is almost always making a retrospective search on a
specific area of interest for the first time. Thus, it is very
helpful to him if he can alter his request after trying it out
on a small part of the past issues of the data base he wishes to
search. By modifying his search request before he has run the
entire search on the usually very large data base, the user
saves both time and money, since he does not have to pay for or
wait for results that are of very little value to him because
they do not contain precisely what he wants. Beyond this, it
would be very wasteful if his search request were rejected
because of a format or spelling error. Thus, because the user
profile or request is usually not well settled for a
retrospective search and because the data bases to be searched
retrospectively are very large, it is usually desirable for the
user to be able to alter or negotiate his profile before he has
searched all the desired issues of the data base. Retrospective
searches can be totally interactive and on-line, or they can be
"semi-interactive", as they are at the University of Georgia,
where correction and updating of the user's profile or request
can be done interactively on-line. (The actual retrospective
searching is, however, done off-line; the interactive on-line
negotiation of the profile is nevertheless valuable for
retrospective searching.)

Since the data bases searched in retrospective operations
are usually very large and since the searching is often done
interactively on-line, it is necessary that the search
operations themselves be as fast and efficient as possible. It
is in searching large data bases that the advantages of file
inversion, mentioned in the previous chapter as a method of
speeding up the search operation, become most apparent. However,
to obtain these advantages in speed, it is necessary to invert
each addition to the data bases. Since each of the common data
bases is updated fairly frequently, this means that much time
and money must be spent to obtain inverted data bases.

As to typical costs involved, SDC (System Development Corp.)
charges $43 per connect hour for on-line retrospective searches
on the Chemical Abstracts Condensates (CA-C) data base and $32
to $35 per connect hour for on-line retrospective searches on
its other data bases. For its semi-interactive retrospective
searches (having a response time of about two weeks), the
University of Georgia charges $70 per profile per volume on many
of its data bases, such as Chemical Abstracts Condensates (CA-C)
odd and even and Biological Abstracts (BA), but $35 per profile
per volume on other data bases such as Bioresearch Index (BIORI).
However, because of the differences in the services provided, it
is not reasonable to compare SDC's charges for retrospective
searches directly with those of the University of Georgia.

SDC provides interactive on-line retrospective searches
for past issues up to seven years old for some data bases.
The files searched contain from 16 to 29 different categories
of bibliographic information (author, keyword phrases, title,
etc.). Some of these categories are directly searchable, but
all categories can be searched on subsets of the file. All
the information in a citation can be requested by the user
through "print" commands. Response time for this service is
usually almost immediate but, as with other time-sharing
systems, degrades as the number of other currently active users
increases.



J. SOFTWARE CONSIDERATIONS: INPUT/OUTPUT PROCESSES, DATA MANAGEMENT REQUIREMENTS



This chapter will discuss the input/output routines and 
then the data management routines required by a typical in- 
formation services and products center. This discussion will 
thus complete the consideration of all the different types of 
routines that form an information retrieval system. 

The input routines are those portions of the entire
information retrieval system which both permit and assist the
user in constructing his profile or request for information.
Profiles can be constructed in natural English prose, as in
the Lehigh University LEADERMART system, or they may be limited
to a restricted English vocabulary and/or logical indicators.
Input routines that process natural English must of course
have some capability for syntactically analyzing English
sentences and must also be able to search thesauri for synonyms
and narrower terms in the hierarchy for the words employed in
the user's request. Such processors will often be able to
interact on-line with the user to negotiate his profile, to
suggest related areas of interest, or to force him to define
his area of interest more precisely. Input routines that process
requests written in a restricted vocabulary have to provide
facilities to correct format and vocabulary errors in user
requests. Such processors also often provide facilities for
looking up synonyms and terms related to those given in the
user's request, though sometimes the user is provided with a
printed thesaurus to assist him in writing his request.

Input routines may also provide for profile inversion.
Here this reduces to thesaurus look-ups to find the broader
terms and the related terms connected with the user's request
terms. (This assumes that the data base searched is ordered
serially and is not in inverted form.) The inverted profile is
then matched against the data base to be searched. Profile
inversion can often be used as a substitute for file inversion
to speed up the searching process, and its use often eliminates
the very costly and time-consuming process of file inversion.

Because of the above considerations, input software for
an information products and services center can often consist
of a number of elaborate routines which are costly to program.
Employing the input routines can also contribute heavily to
the cost of performing the entire operation.

The output routines are those parts of the information
retrieval system which put the user's output into the format
and order indicated by him and then make provisions for routing
this output to him. Output routines are usually "table-driven"
processing routines. The user indicates as part of his input
the format and order in which he desires his output. These
indications are reduced to tables, one for each user. The
output routines are written to be sufficiently flexible to use
these tables to format and to order the output as the user
desires. (The output routines have to include sorting routines
to put the output into the order indicated by the user.) Thus,
table-driven output processing provides the user with the
maximum flexibility in output format and also permits many
different orderings of the output items. The user is given
maximum control over the form and order of his output.
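
A table-driven output routine of the kind described above might
be sketched as follows. The table keys ("fields", "sort_by",
"separator") and the sample citations are hypothetical; the
point is only that one general routine serves every user, with
the per-user differences confined to the table.

```python
def format_output(citations, table):
    """Order and lay out citations according to one user's output table."""
    ordered = sorted(citations, key=lambda c: c[table["sort_by"]])
    return [table["separator"].join(str(c[f]) for f in table["fields"])
            for c in ordered]


# Hypothetical citations and two hypothetical user tables.
citations = [
    {"author": "Smith", "title": "File inversion", "year": 1972},
    {"author": "Jones", "title": "Profile coding", "year": 1970},
]
user_a = {"fields": ["author", "title"], "sort_by": "author", "separator": ": "}
user_b = {"fields": ["year", "title", "author"], "sort_by": "year", "separator": " | "}

for line in format_output(citations, user_a):
    print(line)
for line in format_output(citations, user_b):
    print(line)
```

Changing a user's output format then requires changing only his
table, not the routine.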

The data management routines handle the complex details
of the billing and accounting procedures used by information
services and products centers. Information service centers
almost invariably provide access to a number of different data
bases, which usually have different usage rates and rate
structures and which derive from different sources. Billing
each customer for the proper amount and performing the necessary
accounting and computation of royalties is made very difficult
by all these different usage rates and rate structures. Thus,
it is necessary to provide fairly elaborate, and therefore
rather costly, computer routines to perform these billing and
accounting procedures for each of a large number of users.
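
The billing problem described above can be sketched with a rate
table keyed by data base. The data base names, rates, royalty
shares, and customers below are all hypothetical; the sketch
only shows how differing rate structures and royalty
computations can be driven from one table.

```python
# Hypothetical rate table. "basis" records the unit in which usage is
# measured for that data base (searches run, or connect hours).
RATES = {
    "DB-A": {"basis": "per_search", "rate": 7.50, "royalty": 0.10},
    "DB-B": {"basis": "per_hour", "rate": 35.00, "royalty": 0.20},
}


def bill(usage, rates):
    """usage: (customer, data_base, quantity) tuples, quantity in the
    unit named by that data base's basis. Returns per-customer charges
    and per-data-base royalties owed to the supplier."""
    charges, royalties = {}, {}
    for customer, data_base, quantity in usage:
        amount = rates[data_base]["rate"] * quantity
        charges[customer] = charges.get(customer, 0.0) + amount
        royalties[data_base] = (royalties.get(data_base, 0.0)
                                + amount * rates[data_base]["royalty"])
    return charges, royalties


usage = [("smith", "DB-A", 4), ("smith", "DB-B", 0.5), ("jones", "DB-B", 2)]
charges, royalties = bill(usage, RATES)
print(charges)
print(royalties)
```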

Included with this discussion of the data management
routines are some comments on the provision of adequate privacy
to each user and on the system measures used to ensure maximum
throughput speed, that is, processing the largest possible
number of user requests in the shortest possible time. The user
wants his profiles and output information to be private and
inviolate. He does not want any possible competitor viewing
them, and he certainly does not want anyone accidentally or
purposely destroying them. The problems of privacy and
inviolability are two of the most serious problems faced by
time-sharing systems, including those operated by information
service centers. In fact, the problems may be more severe for
such centers because of the large number of users they often
have and the fact that they are often connected to terminals at
remote locations.

The usual method of handling these privacy problems is to
assign a secret password or code to each user and then to
include checks in the programs to ensure that only users with
the right codes have access to the information corresponding
to those codes. The University of Georgia system assigns a
secret code to each terminal and then attaches this code to
all information coming from or going to that terminal, so that
a user has access only to information produced at the terminal
he is using. (35)
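
The terminal-code scheme can be illustrated with a minimal
sketch. It is not the University of Georgia implementation; the
class name, the codes, and the stored items are hypothetical,
and a real system would also have to protect the codes
themselves.

```python
class TaggedStore:
    """Keeps every stored item tagged with the code of its originating terminal."""

    def __init__(self):
        self._items = []  # (terminal_code, payload) pairs

    def save(self, terminal_code, payload):
        self._items.append((terminal_code, payload))

    def fetch(self, terminal_code):
        """Return only the items tagged with the requesting terminal's code."""
        return [payload for code, payload in self._items if code == terminal_code]


store = TaggedStore()
store.save("T-17", "profile: adjustment counselors")
store.save("T-42", "profile: file inversion")
print(store.fetch("T-17"))
```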

One of the more obvious methods for an information products
and services center to achieve maximum throughput speed is to
group the searches for each data base together and then run
each group separately, only on the appropriate data base.
However, this is really successful only when the center has a
very large number of searches to perform in a short, but not
too short, period of time. The center must have a fairly large
clientele and must be able to defer some of its searches for a
few days, as would usually be possible with current awareness
searches, for instance. Thus, this method of achieving maximum
throughput speed is useful for some centers but often
inappropriate for others.
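
Grouping the pending searches by data base, so that each data
base is read once per batch, can be sketched as follows. The
request list and the function name are hypothetical.

```python
from collections import defaultdict


def group_by_data_base(requests):
    """requests: (profile_id, data_base) pairs -> {data_base: [profile_ids]}."""
    batches = defaultdict(list)
    for profile_id, data_base in requests:
        batches[data_base].append(profile_id)
    return dict(batches)


# Hypothetical deferred requests accumulated over a few days.
pending = [("P1", "CA-C"), ("P2", "BA"), ("P3", "CA-C"), ("P4", "CA-C")]
for data_base, profiles in group_by_data_base(pending).items():
    # One pass over each data base serves every profile in its batch.
    print(data_base, profiles)
```

The benefit grows with the number of profiles per batch, which
is why the method suits centers that can defer searches and have
a large clientele.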

Another fairly obvious method of assuring maximum through- 
put speed is by the optimal allocation of computer memory and 
storage devices to the various sources of input data. Using 
disc storage for the data base information is probably the 
most efficient storage allocation for interactive on-line 
retrospective searches since large numbers of items must be 
readily available during such searches. For batch searching 
it is probably reasonable to have the data bases to be search- 
ed on magnetic tape, since the magnetic tape can be read in and 
searched serially. 
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K. COMMUNICATIONS ASPECTS OF INFORMATION SERVICE CENTER OPERATIONS 

The successful operation of an information service center
will depend greatly upon the nature of its supporting
communications system. An information services center such as
NASIC will need to communicate with an appreciable number of
remote sources and data bases. At the same time there will be a
growing number of queries addressed to an operational services
center. Input queries may be modest in number at the outset but
will eventually be very large.

A NASIC-type center will use voice communication with its
user customers and with remote sources and service centers. It
will need data transmission capabilities between terminals at
the NASIC center and remote service centers. There will be
slow-speed data transmission of query data and service center
outputs. There will be medium- and high-speed data transmission,
primarily of the quantity outputs from service center search
operations. High-speed lines will be of particular interest
when interactive processes requiring display terminals are put
into use.

NASIC will have the option to use direct dialing service
or leased (dedicated) lines between communicating points. It
can also make use of the services of commercial networks, which
have certain advantages and provide a way to consolidate
certain costs.

A commercial network service usually includes the costs of
terminals and communications in the service fees. The
discipline, procedures and protocols set up by a commercial
network facilitate the process of accessing all other nodes on
the network. This would be advantageous if a service center
such as NASIC had a reasonable amount of traffic with these
particular locations.

The NASIC operation will require the use of both dial-up 
and dedicated lines. A reasonable mix of these services can 
be established based upon the traffic needs that will evolve 
for the NASIC operation. Considerations which should be kept
in view in determining when to use dial-up or dedicated line
service are summarized in Figure 6.

Apart from communications with outside locations, a NASIC
service center will have to deal with a difficult set of
problems concerned with interconnections at its headquarters
facility. A service operation will be at the center of input
query activity, of data base and data services accessing, and
of the output and delivery of search and processing results. At
the beginning, staff personnel can probably manage the
interconnection problem, but as traffic increases this will not
be adequate. There will be too few staff people to do the job,
and the center's response capability will tend to deteriorate.

CONSIDERATIONS IN DETERMINING USE OF DIRECT DIAL-UP
OR DEDICATED LINE COMMUNICATIONS

DIRECT DIAL-UP

No line charges when there is no connection.

Faulty line problems are solved by re-routing.

Data rates are low: 1800 baud (bits/sec) is a maximum for
dial-up lines.

Quality of lines varies; a bad connection could mean many
errors.

Cost reduction options are available (e.g., WATS service).

Response time to establish a connection is a factor.

Provides more accessibility for a service center to the user
environment.

DEDICATED (LEASED) LINES

Use where message traffic is high, where remote locations
are fixed, where average duration of calls is long.

Lines are available when needed; a physical connection
exists.

Are high quality lines, tend to have lower error rates,
generally cost more, can handle 50 kilobaud data rates.

To insure against failure a duplicate back-up line is
required.

If line goes down, it usually takes longer to find a fault
and correct it.

Is indicated when privacy is to be maintained.

FIGURE 6

To deal with the interconnection problem at the headquarters
facility, NASIC should consider the application of a
"communications bus" approach. This is being developed and
applied in different ways, but a good example of the kind of
capability which can be achieved is given by the MITRIX system
developed by the Mitre Corp. (38). MITRIX is a time-division
multiple-access digital communication system employing cable
technology with mini-computer controls and low-cost components.
It can be used in a closed-loop configuration in which the
widest variety of communication devices, terminals, processors
and data lines can be connected to the central high-bandwidth
"bus" channel. The opportunity to implement this system in
modular fashion with modest initial cost, and to automate
interconnection functions as needed, makes this an attractive
approach to consider.

A center such as NASIC will be very much concerned with
keeping account of its service activity; it will also be
billing service charges. There will be a wide variety of rate
structures, and in some cases royalty payments will have to be
applied. For any operation with a sizeable volume, billing,
bookkeeping and royalty controls can only be done with
automated processing aids. With a system such as MITRIX in view
for handling the internal headquarters interconnection
requirements, it is possible to plan and implement the needed
processing aids in a systematic way as the volume of business
requires, all within reasonable economic bounds.
L. TECHNOLOGY CHANGE AND INFORMATION SERVICE CENTER PLANNING

As part of its initial plan, NASIC will be getting into
operation with the use of "state-of-the-art" methods, equipment
and procedures. There will be no opportunity in the early
phases of NASIC to carry on extensive development effort or to
do any extensive adaptation of newly available technology for
NASIC requirements. NASIC is, however, beginning its operations
at a time when dramatic changes are taking place in the
communication and processing fields. These developments are of
great importance, and they can have a strong impact on any
NASIC system which will be set up. Although there will be
limits on the actions which NASIC management can take at this
time, they should be aware of key technology developments and
should take these into account in their forward planning. It
will be useful to review briefly the status of technology in
several areas which have a direct relationship to a NASIC-type
service center operation.

1. BRIEF OVERVIEW OF RELEVANT BUT CHANGING TECHNOLOGY
a. Speed, Power & Cost Trends in Computer Technology
The remarkable changes that are going on in the basic
characteristics of computer tools are summarized in a study
made by Ware several years ago (39). He listed and projected
cost, speed, size and power factors for the period 1955 through
1975. Ware's projections have proved to be remarkably close
approximations of what is actually happening. In a twenty-year
period the size of the computer will decrease 10,000 fold for
equal computational capability. In the same period the unit
cost of computation is down by the startling figure of 200,000
fold, while speed has increased 40,000 fold. There has also
been an explosive growth of installed capacity in the U.S.
which, over the double decade 1955-75, will increase 160,000
fold. By 1975 computer processors should operate close to 10^9
operations per second. Ware's data is generally accepted as
realistic, although the recent advances in the application of
Large Scale Integrated Circuit (LSI) technology make some of
his estimates very conservative. The clear implication is that
there will be a continuing change in computer capabilities
which will have a continuing effect on computer systems design,
on usage patterns, and on costs.
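
To put the twenty-year factors quoted above in more familiar
terms, each can be converted to an equivalent constant yearly
factor by taking its twentieth root. This is a worked
illustration only; it assumes steady compound change over the
period, which Ware's study is not stated to claim.

```python
def annual_factor(total_factor, years=20):
    """Constant yearly factor that compounds to total_factor over the period."""
    return total_factor ** (1.0 / years)


# Twenty-year factors as quoted in the text (1955 through 1975).
quoted = {
    "size reduction": 10_000,
    "unit cost reduction": 200_000,
    "speed increase": 40_000,
    "installed capacity increase": 160_000,
}
for name, factor in quoted.items():
    print(f"{name}: about {annual_factor(factor):.2f}x per year")
```

Under this assumption a 200,000-fold reduction in unit cost over
twenty years corresponds to cost falling by a factor of roughly
1.8 every year.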

b. Integrated Network Systems
In the last few years there has been a remarkable evolution of
large integrated computer systems, which usually consist of a
network of processors, associated memories, and peripheral
input/output equipment, all interlinked (40). Often portions of
these systems are separated by long distances. They may have
huge data bases which can be accessed by hundreds to thousands
of system users at locations which may be world-wide. The
pressures to set up integrated networks arise largely from the
need to support many system users and to keep the costs of
computer facilities within economic bounds. Integrated networks
provide a way to share computer resources and to distribute
processing capabilities as well as programs and data, so that
large groups of users can be served by a common set of
capabilities.

It should be noted that developing computer network systems
are definitely tending to become utility systems. There is an
emphasis on providing a spectrum of programs and processing
functions which can serve the common needs of most of the
system users, together with file storage capacity to serve many
needs. Both batch and interactive processes can be supported in
a single system. In order to achieve satisfactory response
times and efficient time-sharing, the architecture of network
systems tends toward distributed processing configurations (41).
That is, the central processor for the system is usually
powerful and supports the large files required by the system,
but many of the processor functions are carried out at the
periphery of the system by smaller satellite processors, and
only occasionally is there need for these smaller processors to
call on the central portion of the system for new instructions,
programs, or data. The processing load for a system of this
kind is distributed, which makes possible a reasonably economic
configuration.

c. Fast, Low-Cost Communications. Advances in the
utilization of repeater relays in hard-wire communication lines
and the refinement of digital communication methods, including
pulse-code modulation (PCM), have contributed to a major increase
in digital communications capabilities and a rapid lowering of costs.
The full impact of this will be realizable in the next few
years. For example, a communication channel of 1.5 megabit
capacity for a 25 mile data-link, a short time ago, could be
obtained for one-fifth of what the cost of a
50k bit channel had been for some time previously.
It has been estimated that the eventual cost for a data-link
from a terminal to any location in the U.S. will level off at
about $.30 per hour of usage (42).
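
The channel comparison above can be restated as a cost per unit of
capacity. The short Python calculation below is an illustrative sketch
only: the variable names are ours, and the cost of the older channel is
normalized to 1 because the text gives only the ratio, not dollar figures.

```python
# Relative cost per unit of capacity, using only the ratios quoted in the text.
old_capacity_bps = 50_000        # the 50k bit channel
new_capacity_bps = 1_500_000     # the 1.5 megabit channel
old_cost = 1.0                   # normalized; no dollar figure is given
new_cost = old_cost / 5          # "one-fifth" of the older channel's cost

capacity_gain = new_capacity_bps / old_capacity_bps
cost_per_bit_improvement = (old_cost / old_capacity_bps) / (new_cost / new_capacity_bps)

print(capacity_gain)             # how many times more capacity
print(cost_per_bit_improvement)  # how many times cheaper per unit of capacity
```

On these figures the newer channel carries thirty times the capacity, so
the cost per unit of capacity falls by a factor of about 150.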

The low level of communication costs for a computer net- 
work system has major implications for the design of systems. 
Whether a computer is located in the next room or one hundred 
miles away will make relatively little difference in the over- 
all system cost(43). 

d. Large High-Speed Memories. A new set of processing
capabilities which will be available soon and which can have
a major impact on system design is represented by high-speed
mass memories (44).

There are memories in experimental operation with
capacities of the order of a trillion bits (10^12 bits). As a
reference it may be noted that the complete textual holdings 
of the Library of Congress can be represented by lO''^^ bits. 
There are various media used for storage, different recording 
and reading methods are used, and several companies will be 
marketing particular mass memory products in the next year 
or two. It is clear that good reliability can be achieved, the
cost of the memory unit is at a reasonable level, and good
average access times are being achieved (of the order of
present disc memory access times). There are several specialized
mass memories which are used in combination with powerful
central processors where transfer rates of the order of 40
megacycles : 10 bits/second) are attained.

The availability of mass memories of high capacity, good
access times, good reliability at reasonable cost will very
much enhance the response times which are possible in
distributed network systems because of the abilities to transfer
large streams of programs, instructions, and data at high
speed. With mass memories available, problems concerned with
input processing, labeling, and structuring of data to be
stored will be intensified. It is quite apparent at this time that
we do not know how to store data in a mass memory so that we
can use the storage efficiently, even if we could afford the
cost of input and take all the time needed to load the store.
In any case, it seems certain that the impact of new mass
memories on system design will be a major one.

e. Low-Cost Computer Processors. In line with the
general trends mentioned above, there has been a large output of
small, powerful low-cost computer processors in the past few
years. These machines have been generally labeled
"minicomputers" and are capable of performing a wide spectrum of
processing functions. They can be used as stand-alone processors or
as satellite processors in a network system and can be
switched from one mode to the other. Beyond the minicomputer in
processor size is the "microcomputer". These computers are a
development of greatest importance in the computer field and
will have a profound impact on machine processing systems
for many years to come (45). Microprocessors or microcomputers
are an application of Large Scale Integrated Circuit (LSI)
technology which is usually implemented via semiconductor
technology. In very general terms LSI consists of the use of
hundreds or thousands of circuits or memory cells in a single
integrated unit on a minute chip. Note that memory as well
as computing capability can be implemented in LSI.

LSI developments have been watched with much interest 
over the past few years but the picture that is now evolving 
and which brings new implications is that much higher levels 
of computing power will be achieved in LSI and these will be 
implemented at cost levels which were not considered possible 
in previous estimates. Not only will low-cost computing power 
be available, but large amounts of high-speed memory will also 
be available in LSI at unbelievably low costs. Processors with 
very high computing power will be producible as separate LSI 
units; high-speed memory units in LSI with millions of bits of 
capacity will also be available. All of these LSI units will 
be capable of being produced in large quantity lots at costs 
of, say, less than $100 per processor or memory unit. 

f. Low-Cost Display Terminals. A display terminal has
advantages over the terminal with only a keyboard because it
makes possible a much greater "bandwidth" between a human user
and the machine system. A computer can feed back much more
data faster, with the use of a display, and this facilitates
certain interactive man-machine processes. Costs of display
terminals are continuing to become more reasonable. Where
costs of a display terminal were at the $25,000-$50,000 level a
short time ago, they are available at about $2000 and will
probably level off eventually at about the cost of a typewriter.
The availability of low-cost display terminals has a major
relationship to the design and utilization of the machine
processing system.

It should be noted that many particular processing functions
can be incorporated in a display terminal so that it becomes a
kind of small specialized processor. These terminals have been
designated as "intelligent terminals" and are significant because
they provide a way to relieve a central support system of a
processing load which might be required because of a specialized
application.

2. THE POTENTIAL OF TECHNOLOGY ADVANCES FOR NASIC SEARCH 
CENTER OPERATION 

There is no disagreement that many interesting
technological developments are coming along and that these will at some
time be put to useful applications. However, there is a
difference of opinion as to how much a service center, such as
NASIC, which is just beginning operation, should pay attention
to evolving technology. The argument could be made that NASIC
should get under way using established procedures and tools in
order to get a beginning level of service started even if this
is largely a minimal activity. There is another viewpoint which
says that much of the newly developing technology is already at
a stage where it can be applied and adapted and that what is
needed is a clear specification of system needs and require- 
ments in order to get on with the adaptation of the new tech- 
nology. This viewpoint also holds that costs of application 
and adaptation will be within reasonable economic bounds if 
the benefits and the increased system outputs and services 
are taken into account. There is not much to be gained by 
further discussion of these different viewpoints but there 
is some value in pointing out that NASIC would tend to gain 
appreciably in the quality and capacity of its operation if 
some of the projected technological goals are fully realized. 

In the future, improved communications would include lower
costs, high-speed wide band-width channels, automated switching,
networking and interconnection processes. Improvements in
data storage capabilities would include lower cost memories,
memories of massive capacity, high data transfer rates, and
automated file organization. Improved on-line processing would include
low-cost display terminals, stand-alone display processors, a
wide-spread proliferation of terminals, improved integrated
operating systems, distributed processing networks, and low-cost,
small size mini and micro processors. What do these developments
mean for a NASIC service center operation? What potential do
they represent?

The availability of greater capacity for making storage at
lower costs means an ability to replicate holdings, to store
massive libraries of data and information at a local center.
This means a new set of options to provide faster processing
and service. Low cost microprocessors provide new
opportunities to do multiple searches and processing operations to
accommodate larger volumes of queries for more system users. The
facilitation of on-line capabilities provides new levels of
accessibility. Not only will more people with information needs have
the ability to explore the holdings and services of the center, but
also there will be new ways to provide delivery of search
results and document texts. It all adds up to new potentials
to provide higher levels of accessibility and an improved
ability to handle higher volumes with fast response at costs
which will be acceptable to the user population.
3. IMPLICATION FOR NASIC PLANNING POLICY

The preceding discussion regarding technology change and
its potential for NASIC has been intended to underline the fact
that the essential tools which NASIC needs for its operations
are in fact undergoing significant change. In these
circumstances it is important to emphasize that NASIC should not use
its resources to establish and elaborate procedures and
processes which are certain to be altered or eliminated in a fairly
short time period. Where new approaches and tools with a
prospect for stability are proven to be satisfactory and reliable
they should be used and exploited.

It may be advantageous for NASIC to set up certain
experimental subsystems in order to provide a reasonable interplay
with new and evolving capabilities. For instance, it would be
possible to select some segment of the user population which has
a clearly defined set of data and information needs.
Experimental services could be provided using all possible tools including
newer methods to give the most rapid response, the best coverage,
and the fastest document delivery. It might be important to set up
direct communication lines with selected users to test certain
on-line interactive interrogation and searching of selected
files.

Consideration should be given to the use of the NASIC op- 
eration as a test-bed situation where new developments could 
be tried and tested. An active part of the NASIC program 
should include a certain amount of on-going research and de- 
velopment pointed toward the achievement of NASIC's basic goals. 
There is no reason why R&D effort could not reinforce and
supplement the primary objective of NASIC to serve the Northeast
Science Community using existing methods and sources.

In setting up a new operation it would be an unwise
approach to concentrate wholly on an operation which would employ
only "state-of-the-art" methods and means. It is doubtful
that any operation based on this approach would be able to
meet requirements for coverage, speed of response and quality
of service especially if a sizeable user population is to be
served. NASIC will want to anticipate that its operation,
after a period of initiation and growth, will be handling a
large volume of queries and responses. It will be serving a
large user world and will have to move in the direction of
automation of its handling, search, processing and delivery
processes in order to give a continuing level of satisfying
service.

M. SOME MANAGEMENT ASPECTS OF AN INFORMATION SERVICE
CENTER OPERATION

An information service center as proposed in the NASIC
program is intended to serve as a kind of brokerage and
interface between individuals in the science community who have
information needs, and the sources, data bases and services which
can serve these needs. By the aggregation of data bases and
services and the providing of access to services at one central
point, it is intended that certain advantages be realized for
the center operation as well as the potential users.

If the objectives for NASIC are to be attained, it will be
essential that NASIC management be very clear about the kind of
user market it intends to serve, as well as the nature of the
service it intends to provide. In the following discussion we
believe it is useful to review what options NASIC may wish to
exercise.

Objectives for Continuing Support. The NASIC project is in a
beginning phase. The general guidelines which have been set up
emphasize the desirability of achieving a non-subsidized, paying
operation within a three year period. A major objective, then, is
to provide services which will attract customers who are able to
pay the costs of the services given. This is of prime importance
because, without the prospect of subsidization after the three-year
period, the very survival of a NASIC operation is at stake. In its
decisions about kind of service and the choice of user clientele,
the need then to generate a paying operation will be a major factor.

User Market. It is essential that NASIC make an early decision
about the user clientele it intends to serve. The charter for the 
NASIC project does make clear that NASIC has a prime responsibility 
to the academic science community. Students and faculty members 
of the many colleges and universities of the Northeast will be 
interested in NASIC services. There is a question, however, whether 
many individuals in the academic world will be willing or able to 
pay for the services which are offered. Institutional contribu- 
tions will undoubtedly be an important source of NASIC funding when 
NASIC has demonstrated its usefulness. However, it is very doubt- 
ful that this support alone can provide what NASIC needs. It is 
almost inescapable that NASIC will have to point to the commercial 
and industrial elements of the science community as a primary 
marketing objective.

Kind of Service. An information services center will provide 
a spectrum of data and information services as described in earlier 
sections of this report. To achieve a viable operation which will
be attractive to potential users in the science community, it is
important that NASIC adopt a clear policy regarding the kind of
service it intends to give. It is not sufficient to assemble all
available data bases and information files together with a means 
to access them, and then expect customers to approach the service 
center to use what has been assembled. Instead, it is important that 
the service center management take steps to understand what the
information need is of the typical user in his environment, and then
take steps to focus the service center operation to meet that need.
It will be profitable to spend time and effort to identify as 
clearly as possible what the information and data needs of indivi- 
duals are and then adjust the objectives of the service center opera- 
tion on the basis of this knowledge. There will be opportunities 
to be selective about the choice of data bases and sources. There 
will be options about kind of access and document delivery. What 
is to be provided by the center should be the best service available 
to meet in the best and fastest way a real and unsatisfied customer
need.

Promotion. It is a reasonable extension of the preceding
discussion to emphasize that NASIC management will wish to give attentive
consideration to all kinds of means to promote NASIC services. It is
the case that there are many situations in both the academic and
commercial areas of the science community where data and information
requirements exist and are not being served. NASIC will want
to publicize vigorously what services and data sources it can make
available and the convenient ways to access these services. A strong
promotional program will be a necessity to assist in making a NASIC
center a viable operation.

Introduction



This report is based largely on an important 
and unique workshop held by the Institute for 
Communication Research, Stanford University, 
April 23-25, 1973. The Institute has a contract 
with the National Science Foundation to conduct a 
comprehensive, comparative analysis of some major 
on-line, interactive systems for bibliographic 
searching in use in the United States at the present 
time. The workshop itself was considered an important
element in this overall evaluation.

The principal investigators are Ed Parker and 
Tom Martin of the Institute for Communication 
Research. A committee of "experts" has been formed
to provide assistance and guidance in the conduct 
of the study. The members of this committee, which 
includes the present writer, are listed in Appendix 1. 
Members of this committee acted as moderators of 
the five sessions of the workshop. Eleven systems 
are being studied. Brief "snapshots" of these systems
are given in Appendix 2 and the system representatives 
attending the workshop are listed in Appendix 3, along 
with other observers who were invited to the sessions. 

The eleven systems include the major bibliographic 
searching systems now in use with two notable exceptions: 
the New York Times Information Bank and the INQUIRE
system. It was not possible to include the former
because of the proprietary nature of the system,
which was designed by IBM exclusively for use by
the New York Times. The owners of INQUIRE chose not
to participate in the study. 

While the study is largely concentrated upon
general-purpose bibliographic searching systems,
some special-purpose systems were included for
purposes of comparison. One special-purpose system
is RIQS, the Remote Information Query System,
developed at Northwestern University. RIQS is designed
largely to allow individual academic researchers to set
up, manipulate and query their own personal
collections of documents, data, or bibliographic
references. RIQS may be regarded as both a
bibliographic retrieval system and a management
information system. The NASIS system, operated by
the NASA Lewis Research Center, is designed to be
compatible with RECON and DIALOG. The major NASIS
data base is a large collection of aerial photographs,
but other data bases, including a bibliographic
data base from the Nuclear Safety Information Center
and some management data bases, are also included
in the overall NASIS system. All of the systems
represented, with the exception of LEADER, are
designed to respond to conventional Boolean
searching strategies. LEADER operates in a completely
different way. Document representations in LEADER
consist of noun phrases, extracted from text by computer.
The system may be queried by means of a request in
the form of an English sentence. This sentence
is matched against the stored document representations
(in noun phrase form) and those representations that
best match the request are displayed for the user
(i.e., the search process is essentially a pattern
matching operation). DATA CENTRAL, STAIRS, LEADER
and INTREX were designed primarily to operate with
natural language. ORBIT, DIALOG, BASIS-70 and RECON
were designed primarily to operate with controlled
vocabularies, but each has been modified, or is
in the process of being modified, to allow searching
in a text mode. The other systems operate with text
or limited vocabulary, depending upon specific
applications. RIQS has no capability for constructing
and operating on inverted files.

The system "snapshots" of Appendix 2 present
basic features of these systems in a standard format.
For each the following characteristics are given:

(1) the type of data base handled,

(2) whether or not the system has a
text search capability,

(3) system availability (accessibility),

(4) type of user,

(5) type of terminal,

(6) type of service offered,

(7) whether the terminals are in a central
location (i.e., in a center of service)
or distributed remotely (personal
terminals),

(8) type of charges applied,

(9) programming language used, and

(10) hardware for which the system is
designed.



The workshop itself was unique in that it brought 
together representatives of eleven systems, including 
competitors, in order to exchange views and experiences.
The major objective of the symposium was to try to
arrive at a set of "minimal features" that all
participants agreed were essential to the operation
of an efficient, viable interactive system. The
meeting was remarkably effective in that agreement
was generally reached on most of the features discussed.
All of the systems, with the exception of LEADER, 
were demonstrated at the meeting. 

The meeting was divided into six sessions, each
with its own moderator, devoted to the following
topics:

(1) The searcher/task environment

(2) The data base environment

(3) Search negotiation features

(4) Display and secondary support
features

(5) Instructional and diagnostic
features

(6) Overview and future directions



The later sessions of the workshop brought 
forth detailed and constructive discussion and they 
are worth summarizing here for that reason. The
summaries of these sessions presented the most
complete and up-to-date data now available on the
characteristics and capabilities of present-day
on-line, interactive retrieval systems.

THE SEARCHER/TASK ENVIRONMENT



This was the subject of the first session of 
the symposium. The discussion was general and 
exploratory. Participants were feeling their way 
only. The moderator, Nance, pointed out that the 
published literature on the development and operation 
of on-line systems for information retrieval is very 
sparse. He also suggested that most of the eleven 
systems represented at the symposium were developed 
in relative isolation, without direct reference to 
other existing systems. If many systems, developed 
independently, incorporate very similar features
it would suggest that there is some evidence of 
consensus on the desirability of or need for the 
features. An important function of the symposium 
was to determine how much agreement exists as to 
which features should be regarded as "minimal" and 
which should be regarded as desirable, although 
not "minimal."

The later sessions of the workshop, which got
down to details and which form the real heart of this
report, did explore the degree of consensus existing
among the various participants.

THE DATA BASE ENVIRONMENT



This second session, chaired by Belzer, also
dealt largely in generalities. It attempted to 
explore the question of how much the user needs 
to know about the data base he wishes to interrogate. 
Each representative described various characteristics 
of data bases handled by his system. This portion 
of the symposium presented necessary background 
for the one and one-half days of discussion that 
followed, but it is not worth reporting in detail
here.

SEARCH NEGOTIATION FEATURES



The major search negotiation features are
presented in Table 1, with various symbols supplied
to indicate which systems have which features.
The starred (*) features were regarded by the majority
of participants to be "minimal features" (i.e.,
features without which an on-line, interactive system
is unlikely to function effectively).

Request sets 

This refers to the capability of assigning a
number to each line of a search statement (e.g., a
string of terms in a Boolean OR relationship) and
allowing the building of complete and complex search
strategies by incorporating logical sets by line number,
as in the following example:

1. A or B or C or D or E

2. Q or R or S

3. 1 and 2

4. 3 and not M

There was general agreement that this capability should
be regarded as minimal. Only the representative of
LEADER, which operates in a non-Boolean search mode,
disagreed.
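
As an illustration only, the four-line strategy above can be sketched
in a few lines of Python. The posting lists and document numbers are
invented for the example, and the names `postings`, `union` and
`request_sets` are ours rather than those of any system discussed.

```python
# Hypothetical posting lists: term -> set of document numbers.
postings = {
    "A": {1, 2}, "B": {3}, "C": set(), "D": {4}, "E": {5},
    "Q": {2, 3, 9}, "R": {4}, "S": {10},
    "M": {3},
}

def union(*terms):
    """Boolean OR over the posting lists of the given terms."""
    result = set()
    for term in terms:
        result |= postings.get(term, set())
    return result

request_sets = {}                                     # line number -> result set
request_sets[1] = union("A", "B", "C", "D", "E")      # 1. A or B or C or D or E
request_sets[2] = union("Q", "R", "S")                # 2. Q or R or S
request_sets[3] = request_sets[1] & request_sets[2]   # 3. 1 and 2
request_sets[4] = request_sets[3] - postings["M"]     # 4. 3 and not M

for number, docs in request_sets.items():
    print(number, sorted(docs))
```

Because each numbered line is retained as a set, a later line can refer
to any earlier one without the terms being re-entered.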

Dictionary access 

This refers to the capability of displaying,
in alphabetical sections, the "vocabulary" of the system,
whether the vocabulary be a controlled vocabulary
(subject headings or a thesaurus), text words, or
indexing phrases (as in LEADER or INTREX). As
examples, the NEIGHBOR command in ORBIT and the
EXPAND command in RECON will cause the terms
alphabetically adjacent to the input terms to
be displayed. The alphabetical display is generated
even if the input term does not appear in the data
base (i.e., the terms closest to it alphabetically
are displayed); this allows the system to compensate,
to a certain extent at least, for some spelling or
typing errors. This is perhaps particularly useful
for names of authors. In DATA CENTRAL and SPIRES
the alphabetical display does not show the number
of postings associated with each term in the display;
in the other systems having the dictionary access
capability (see Table 1) tallies (postings) are
displayed.
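
The behavior described above, an alphabetical window that is produced
even for a term absent from the data base, can be sketched as follows.
This is a minimal illustration in Python under our own assumptions: the
vocabulary, the tallies and the function name `neighbors` are invented,
and the sketch does not reproduce the NEIGHBOR or EXPAND commands.

```python
import bisect

# Hypothetical vocabulary with posting counts (tallies).
vocabulary = {
    "LANCASTER": 12, "LANDAU": 3, "LANE": 7, "LANG": 5,
    "LANGE": 9, "LANGER": 2, "LANGLEY": 4, "LANGSTON": 1,
}
sorted_terms = sorted(vocabulary)

def neighbors(term, before=2, after=2):
    """Return (term, tally) pairs alphabetically adjacent to `term`,
    whether or not `term` itself is in the vocabulary."""
    position = bisect.bisect_left(sorted_terms, term.upper())
    start = max(0, position - before)
    end = min(len(sorted_terms), position + after + 1)
    return [(t, vocabulary[t]) for t in sorted_terms[start:end]]

# A mistyped author name still lands beside the intended entry.
for entry, tally in neighbors("LANGR"):
    print(f"{entry:<10} {tally}")
```

The mistyped entry LANGR is not in the vocabulary, yet the window that
is returned includes LANGER, which is the compensation for typing
errors noted in the text.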

Related term capability 

This refers to the ability to request, for any
search term entered, that the systems display terms
that are semantically related (e.g., see also or RT
relationships). Clearly, this feature implies a
system that makes use of a controlled vocabulary,
although LEADER has its own form of related term
capability (it will display phrases that most closely
match a sentence input by the searcher). In DIALOG
the feature is turned on automatically. That is, when
a searcher enters a term the related terms are
displayed automatically. In other systems the
searcher requests the related terms display. DIALOG,
RECON and LEADER number each term, so that they may
be incorporated into the search strategy by number.
STAIRS automatically adds each related term into
the request set unless overridden by the searcher.
This feature was regarded as very useful but not
"minimal" by most participants.

Hierarchical thesaurus



This relates to the capability of being able 
to display a portion of a hierarchical thesaurus, 
where such a highly structured thesaurus exists 
in the system. For example, the command TREE in 
ORBIT will cause a display of the complete hierarchy 
in which a particular term appears. Again, this was regarded
by most participants as desirable but not minimal.

Incorporation of synonym tables or term hierarchies 

This is a related capability. It refers to 
the ability to incorporate easily into a search 
strategy a complete block of related terms. The 
best example is probably the use of EXPLODE in ORBIT,
which causes the complete hierarchy beneath the 
"exploded" term to be incorporated into a search 
strategy. In DIALOG and RECON, the searcher can 
incorporate a whole block of displayed terms by using 
a "range" indicator (e.g., E 05 - E 17). In DATA 
CENTRAL stored synonym tables can be pulled into a 
strategy by unique identifying number. In STAIRS 
the searcher is given an automatic synonym expansion 
capability. That is, all synonyms stored by the system 
will automatically be pulled in when the user enters 
a search term, unless this feature has been overridden 
by the searcher. 
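
The idea of pulling a complete block of hierarchically related terms
into a strategy can be sketched as below. This is an illustration in
Python, not the ORBIT implementation: the hierarchy, the postings and
the function names are invented, and including the exploded term itself
in the block is our assumption.

```python
# Hypothetical hierarchy: broader term -> immediately narrower terms.
narrower = {
    "ANTIBIOTICS": ["PENICILLINS", "TETRACYCLINES"],
    "PENICILLINS": ["AMPICILLIN"],
    "TETRACYCLINES": [],
    "AMPICILLIN": [],
}
# Hypothetical posting lists: term -> set of document numbers.
postings = {
    "ANTIBIOTICS": {1}, "PENICILLINS": {2, 3},
    "TETRACYCLINES": {4}, "AMPICILLIN": {3, 5},
}

def explode(term):
    """Return the term plus every term beneath it in the hierarchy."""
    terms = [term]
    for child in narrower.get(term, []):
        terms.extend(explode(child))
    return terms

def search_exploded(term):
    """OR together the postings of the whole exploded block."""
    result = set()
    for t in explode(term):
        result |= postings.get(t, set())
    return result

print(explode("ANTIBIOTICS"))
print(sorted(search_exploded("PENICILLINS")))
```

A single command thus stands in for what would otherwise be a long
string of terms joined by OR.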

Search field control 

This was regarded as a minimal feature, especially
for systems searching highly formatted records. It
refers to the capability of specifying which field
(e.g., title, abstract) in a record is to be searched.
An X in the Search Field Control column of Table 1
means that the system will search all fields unless
the user specifies that a particular field is to
be searched. A ✓ means that the default condition
assumes that some particular field (e.g., descriptors)
is to be searched unless the user specifies otherwise.

Boolean operators



The ability to combine terms using the Boolean
AND, OR and NOT operators is a minimal feature for all
systems except LEADER, which functions in a completely
different search mode. In ORBIT, DIALOG, RECON,
BASIS-70, NASIS and RIQS the logical AND is processed
before the logical OR. DATA CENTRAL processes AND
before OR, but processes OR before In STAIRS and 
SPIRES the leftmost operator, whichever it is, is
processed first. In INTREX no compound statements
(both AND and OR in a single statement) are allowed.
There was general agreement that some form of
standardization of Boolean operators was needed,
particularly to allow an on-line user to move freely
from one system to another. For the transfer from
one system to another in a completely mechanized
way the problem does not exist because mapping
algorithms can be developed to take care of system
differences of this type.
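
The practical effect of these differences in processing order is that
the same statement can retrieve different sets on different systems.
The Python sketch below, with invented posting lists and function names
of our own, evaluates one compound statement under two of the
conventions mentioned: AND before OR, and strictly left to right. The
NOT operator is omitted for brevity.

```python
# Hypothetical posting lists for three terms.
postings = {"A": {1, 2}, "B": {2, 3}, "C": {3, 4}}

def combine(op, left, right):
    """Apply a single AND or OR to two result sets."""
    return left & right if op == "AND" else left | right

def left_to_right(tokens):
    """Process operators strictly in the order they appear."""
    result = postings[tokens[0]]
    for i in range(1, len(tokens), 2):
        result = combine(tokens[i], result, postings[tokens[i + 1]])
    return result

def and_before_or(tokens):
    """Process every AND first, then OR the resulting groups together."""
    groups = [postings[tokens[0]]]
    for i in range(1, len(tokens), 2):
        operand = postings[tokens[i + 1]]
        if tokens[i] == "AND":
            groups[-1] = groups[-1] & operand
        else:
            groups.append(operand)
    return set().union(*groups)

query = ["A", "OR", "B", "AND", "C"]
print(sorted(and_before_or(query)))   # read as A OR (B AND C)
print(sorted(left_to_right(query)))   # read as (A OR B) AND C
```

With these postings the first reading retrieves documents 1, 2 and 3
while the second retrieves only document 3, which is why a user moving
between systems needs to know which convention applies.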

Word proximity operators 

This relates to the capability of being able
to specify that a particular combination of terms,
on which a search is conducted, should occur within
a specified distance of each other in the document
record. This distance may be specified by structural
unit (same paragraph, same sentence) or by actual
word distance (two words must occur immediately
adjacent or no more than x words apart). The feature
is clearly minimal for full text systems but not for
systems operating on sets of index terms. It was
generally agreed that, for most purposes, strict
word distance was probably the best method to achieve this
system capability. Very little need has been found
to specify word co-occurrences within linguistic
units (paragraph, sentence). All systems represented
here allow exact phrase matching (i.e., two or more
words must occur in adjacent positions in text).
In ORBIT and SPIRES this exact phrase matching
is achieved by sequential search. Only DIALOG,
RECON and DATA CENTRAL allow the user to specify
how many words may separate two search terms.
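
A strict word-distance test of the kind described can be sketched in
Python as follows. The function name and the sample record are ours.
Two points are assumptions rather than anything stated in the text:
"x words apart" is read as the number of intervening words, and the two
terms are accepted in either order.

```python
def within_distance(text, first, second, max_apart):
    """True if `first` and `second` occur with at most `max_apart`
    intervening words between them, in either order."""
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    return any(
        abs(i - j) - 1 <= max_apart
        for i in first_positions
        for j in second_positions
        if i != j
    )

record = "on line interactive retrieval of bibliographic information"
print(within_distance(record, "interactive", "retrieval", 0))    # adjacent words
print(within_distance(record, "interactive", "information", 2))  # three words intervene
print(within_distance(record, "interactive", "information", 3))
```

Setting the distance to zero gives adjacency, which is the basis of the
exact phrase matching that all of the systems allow.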

Arithmetic operators 

This capability involves the use of GREATER 
THAN, LESS THAN, and BETWEEN operators. It is
particularly valuable for systems handling 
numerical data (e.g., dollar values) and was 
regarded as a minimal feature by most participants. 
In STAIRS the arithmetic operators can only be used 
when performing a sequential search. 

Suffix removal 

There was general agreement that an interactive
system, particularly one handling full text, should
allow suffix removal (truncation). Usually suffix
removal is under the control of the user. He enters
any root and follows it with some truncation code.
In INTREX, however, search terms are automatically
"stemmed" unless the term is followed by an
exclamation point. For example, the term "surgery"
is automatically reduced to its root, "surg...",
and will retrieve all words with this root, whereas
"surgery!" retrieves only this particular character
string. In DATA CENTRAL a final "s" on a word is
automatically disregarded.
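
User-controlled truncation of the usual kind can be sketched as below.
This Python example is illustrative only: the vocabulary is invented,
the "$" truncation code is an assumed symbol, and no attempt is made to
reproduce the automatic stemming of INTREX.

```python
# Hypothetical vocabulary of indexed words.
vocabulary = ["surge", "surgeon", "surgeons", "surgery", "surgical", "survey"]

TRUNCATION_CODE = "$"   # assumed symbol, chosen only for this sketch

def match_terms(entry):
    """Expand a search entry into the vocabulary words it matches.
    A root followed by the truncation code matches every word that
    begins with that root; any other entry must match exactly."""
    if entry.endswith(TRUNCATION_CODE):
        root = entry[: -len(TRUNCATION_CODE)]
        return [word for word in vocabulary if word.startswith(root)]
    return [word for word in vocabulary if word == entry]

print(match_terms("surg$"))
print(match_terms("surgery"))
```

The truncated entry retrieves every word beginning with the root, while
the untruncated entry retrieves only the exact character string.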

It is also desirable in many situations to be 
able to search on strings of characters other than 
prefixes (i.e., suffixes or infixes). Searching
on suffixes can be a very powerful device in a
natural language system. This type of search
presents no particular problem in a batch processing,
serial searching operation. It does, however, present
problems in an on-line, random access system based
on inverted files. It would certainly be extremely
expensive to maintain inverted files for all suffixes,
as well as prefixes. However, there is no real
reason why inverted files could not be set up for
selected suffixes that are known to be especially
valuable in searching particular collections
(e.g., ...MYCIN).
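
A minimal sketch of such a selected-suffix file follows. Only the MYCIN ending comes from the text; the second suffix, the sample records and all names are hypothetical.

```python
from collections import defaultdict

# "mycin" is the example given in the text; "cillin" is an added assumption.
SELECTED_SUFFIXES = ("mycin", "cillin")


def build_suffix_index(documents):
    """Map each selected suffix to the ids of documents with a word ending in it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            for suffix in SELECTED_SUFFIXES:
                if word.endswith(suffix):
                    index[suffix].add(doc_id)
    return index


docs = {
    1: "Streptomycin resistance in bacteria",
    2: "Penicillin dosage guidelines",
    3: "Erythromycin and neomycin compared",
}
index = build_suffix_index(docs)
print(sorted(index["mycin"]))
```

Because only a handful of endings are indexed, the file stays small while still giving direct access for the suffixes that matter in the collection.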

Phrase decomposition 

This is an unusual feature possessed by only 
three of the eleven systems. It involves the 
decomposition of a natural language phrase into 
"significant" words, ignoring nonsubstantive words. 
In INTREX a logical AND relationship is assumed 
between the remaining words. In SPIRES the assumed 
relationship may be OR or AND, depending upon the 
file being searched. 
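
Phrase decomposition can be illustrated with a short sketch. The stop-word list, the `operator` argument and the toy index are assumptions; they do not reproduce the INTREX or SPIRES procedures.

```python
# Illustrative list of nonsubstantive words.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def decompose(phrase):
    """Keep only the "significant" words of a natural language phrase."""
    return [w for w in phrase.lower().split() if w not in STOP_WORDS]


def search(phrase, index, operator="AND"):
    """Combine the postings of the significant words with AND or OR."""
    sets = [index.get(w, set()) for w in decompose(phrase)]
    if not sets:
        return set()
    if operator == "AND":
        return set.intersection(*sets)
    return set.union(*sets)


index = {"retrieval": {1, 2}, "chemical": {2, 3}, "information": {2, 4}}
print(search("retrieval of chemical information", index))
print(search("retrieval of chemical information", index, operator="OR"))
```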

Data base partitioning 

This relates to the capability of partitioning 
the data base, for search purposes, so that a particular 
strategy can be applied only to a limited portion of
the data base (e.g., specific years, specific types 
of documents). A somewhat related capability, possessed 
by STAIRS, is the ability to concatenate data bases. 
For example, the STAIRS user has the option of 
specifying that his search strategy is to be applied 
across several data bases existing in the system. 

Sequential searching

This refers to the capability of being able to
search on portions of the record for which inverted 
files have not been created. Typically, a system 
may conduct most of its searches on inverted files 
but, once a particular set of documents has been 
identified in this way, the system will allow the 
sequential searching of these records (e.g., a title 
scan). In RIQS all searching is sequential.

Profile storage



This feature allows a user to store a search 
strategy within the system. This pre-stored strategy 
can be called up and displayed or used to search 
against full or partial data bases contained within 
the system. The feature is most valuable, of course, 
in providing the capability for SDI (selective 
dissemination of information): the user can store
a search strategy, representing his profile of current
interests, within the system and simply visit the
terminal periodically (say, monthly) in order to
find references to relevant items added to the system
since he last consulted it.
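
The SDI use of a stored profile can be sketched as below. The class, its methods and the use of accession numbers to mark "items added since the last visit" are all assumptions made for the example.

```python
class ProfileStore:
    """Holds named search profiles and remembers how far each has been run."""

    def __init__(self):
        # name -> (required terms, highest accession number already seen)
        self._profiles = {}

    def store(self, name, terms):
        self._profiles[name] = (frozenset(t.lower() for t in terms), 0)

    def run(self, name, records):
        """Return accession numbers of matching records added since the last run."""
        terms, last_seen = self._profiles[name]
        hits = []
        newest = last_seen
        for accession, text in records:
            if accession <= last_seen:
                continue  # already examined on an earlier visit
            newest = max(newest, accession)
            if terms <= set(text.lower().split()):
                hits.append(accession)
        self._profiles[name] = (terms, newest)
        return hits


store = ProfileStore()
store.store("lasers", ["laser", "spectroscopy"])
records = [(1, "laser spectroscopy of gases"), (2, "polymer chemistry")]
first_visit = store.run("lasers", records)
records.append((3, "Raman laser spectroscopy"))
second_visit = store.run("lasers", records)
print(first_visit, second_visit)
```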
DISPLAY AND SECONDARY SUPPORT FEATURES



The major features in this category are shown 
in Table 2, which also indicates which systems 
possess which features. Again, the "minimal features"
are denoted by an asterisk. 



Search review 

This feature, regarded as a minimal feature 
by participants, allows the user to review the status 
of his search. In response to some type of "recap"
command the system will display the sets currently
active, the number of citations each contains, and
the strategy that caused each set to be created.
The feature is obviously not needed in a system such 
as LEADER, which does not operate on request sets 
in the conventional sense. 

Predefined formats 

This also was regarded as a minimal feature by 
most participants. The feature allows the user to 
specify which of several standard formats for document
records he would like to have displayed (e.g., title
plus source, full citation plus index terms, title
plus abstract). This feature is under user control
only to the extent that he can select any of several
possible formats. He may not, however, specify new
formats (i.e., formats that the system has not
predefined). The discussion on this feature pointed 
out that the slower the terminal in use the greater 
the need for abbreviated forms of output. 

Field specification 

This is related to the previous feature but places 
the output options more directly under user control. 
That is, systems having this feature allow the user
to request a record format consisting of elements
that he has specified. The record is built up
according to his specifications and is not predefined
by the system. For example, the user can request
a display consisting of report number, title and 
abstract or author, title and abstract. 
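
The difference between predefined formats and field specification can be shown in a few lines. The format names, field names and sample record are hypothetical.

```python
# Formats fixed by the system (the "predefined formats" feature).
PREDEFINED = {
    "brief": ("title", "source"),
    "full": ("author", "title", "source", "index_terms"),
    "abstract": ("title", "abstract"),
}


def display(record, fields):
    """Build the output from whichever fields are requested, in that order."""
    return "\n".join(
        f"{name.upper()}: {record[name]}" for name in fields if name in record
    )


record = {
    "report_number": "R-001",
    "author": "Doe, J.",
    "title": "Sample title",
    "source": "Sample Journal 1 (1973)",
    "abstract": "Sample abstract.",
}

print(display(record, PREDEFINED["brief"]))                       # system-defined
print(display(record, ("report_number", "title", "abstract")))    # user-specified
```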

Rapid scan 

This capability was regarded as being non-essential 
by most participants, and it is a capability that 
only a few systems claim. The feature allows the
display of brief records in rapid succession, without
the need for the user to hit a key to select on a
record-by-record basis. The display continues until
interrupted by the searcher.

Highlighting 

The developers of the Data Central system feel 
that this is a very important feature. Most other 
participants said it was useful but by no means 
essential. It relates to the capability, in a text
searching system, of indicating where in a record
the words causing an item to be retrieved appear. In
Data Central many techniques have been tried. Perhaps
the most impressive makes use of a color terminal with
the highlighted words appearing in a color different
from that of the remaining text. In black-and-white
terminals other techniques are possible, including
the use of arrows, asterisks, blinking, variable
intensity, and the dropping of the highlighted terms
below the level of the others on a line.

Expanding 

This relates to the capability of "expanding" 
the length of a document record displayed. For 
example, the searcher could first request title, 
then full citation, and finally abstract or index
terms. In certain types of systems he may be able
to "expand" to full text, either in digital or
microfiche form. The need for an expand capability
is probably more critical for slower terminals.
That is, for a fast terminal it requires very little
more time to display an abstract than to display
citations. In fact, a really fast CRT display can
probably show abbreviated versions faster than the
user can scan them.

Sorting 

This relates to the capability of sorting the 
set of records that match a particular search strategy
and displaying or printing these records in an order
specified by the searcher. Possible sorting options
include author, journal title, and publication date
(in direct or reverse chronological order). As seen
in Table 2, few of the systems presently offer
alternative sorting capabilities, but several systems
are in the process of giving this capability to the 
on-line user. 

Ranking 

This is related to sorting. It refers to the 
ability of the system to present records to the on-line 
user in order of the degree to which they match his
search strategy. Hopefully, a sequence by "degree 
of match" will also approximate a sequence of probable 
relevance. The LEADER system, by its very nature, 
generates a ranked output. STAIRS and ORBIT have 
limited ranking capabilities. Five possible ranking 
algorithms exist in STAIRS, including one based on 
the absolute frequency with which a search term 
appears in a document, one based on the frequency 
of occurrence of a search term in the corpus as 
a whole, and one based on word proximity within 
documents. I personally feel strongly that a ranking 
capability is needed in on-line, interactive systems,
but the majority of the participants at this workshop 
did not agree with me, perhaps because their own 
systems do not offer this feature. 
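
To make the idea concrete, the sketch below ranks by the first criterion mentioned above, the absolute frequency of the search terms in each document. It is an illustration under that assumption and does not reproduce the STAIRS algorithms; the function name and tie-breaking rule are hypothetical.

```python
def rank_by_frequency(documents, terms):
    """Order matching document ids by total term occurrences, highest first."""
    terms = [t.lower() for t in terms]
    scores = {}
    for doc_id, text in documents.items():
        words = text.lower().split()
        score = sum(words.count(t) for t in terms)
        if score > 0:
            scores[doc_id] = score
    # Ties are broken by document id so the order is reproducible.
    return sorted(scores, key=lambda d: (-scores[d], d))


docs = {
    1: "laser laser optics",
    2: "optics of the laser",
    3: "polymer chemistry",
}
print(rank_by_frequency(docs, ["laser"]))
```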
Computing



Three systems offer the user some computing
facility. An example of this feature is a system
which would allow the user to retrieve descriptions
of all contracts worth in excess of, say, $10,000,
and then do various arithmetic manipulations on
the financial data (e.g., sum the values or average
them). BASIS-70 offers some limited computing
features. Both RIQS and SPIRES interface with
statistical computation programs and can offer the
on-line searcher fairly sophisticated capabilities
for computation.

Microfiche 

This relates to the capability of being able 
to access a remote microfiche file from a searching 
terminal. File addressing is under control of the 
same computer that controls bibliographic searching.
The user, once he has identified citations of interest
to him through the bibliographic searching system,
has the ability to request that microimages of the
item be displayed at his terminal. The images
may be displayed on the CRT that the search is conducted
on, or they may be displayed on an adjacent tube.
INTREX has the most sophisticated microfiche interface
of this type. BASIS-70 also has the capability
of interfacing with a microfiche store. DIALOG
claims the capability but does not use it. Most
participants agreed that a microfiche interface
would be a highly desirable feature but that the
cost of such interfaces at the present time makes
the feature hard to justify for most applications.

Offline printing 

It was generally agreed that all on-line systems 
should give the user the ability to request that an
off-line printout be made of all records satisfying
a particular strategy. All eleven systems have
this capability. Generally the results are mailed
to the user at a later time.

Display of graphs



Both BASIS-70 and RIQS include the capability 
of presenting data in the form of graphs. In the 
case of RIQS quite sophisticated plots are made 
for various characteristics of searches that have 
been conducted (e.g., a typical plot would show how
searches are distributed in terms of the amount 
of time consumed by each). 

Batch retrieval 

This relates to the capability of entering 
a search strategy at an on-line terminal, testing it
out there, and then requesting that the strategy
be used to search some data base in a batch mode.
Typically, a very large system will maintain only
a portion of its data base (perhaps the last two
years) on-line. The remainder of the data base
can be accessed only in an off-line, batch mode.
The searcher should, however, be given the ability
to have his on-line strategy used intact in a search 
of the off-line file. 

Random citation selection 

This feature allows the searcher to specify 
that he would like to see, from the complete set 
of documents that match his search strategy, a few
items selected in a quasi-random manner. Several
systems have the ability to generate such a quasi-random
subset. For example, the ORBIT searcher
can request that, from x citations that satisfy
his search requirement, the system is to print 2,
skip 10, print 2, skip 10, print 2.
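
The "print 2, skip 10" pattern amounts to a simple systematic sample, sketched here with hypothetical names; this is not ORBIT's own code.

```python
def sample(citations, show=2, skip=10):
    """Keep `show` items, pass over `skip` items, and repeat to the end."""
    cycle = show + skip
    return [c for i, c in enumerate(citations) if i % cycle < show]


print(sample(list(range(1, 31))))
```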
INSTRUCTIONAL AND DIAGNOSTIC FEATURES



I chaired this particular session of the
workshop. Below is a summary of my introduction
and the framework for discussion that I attempted 
to establish. 

The session deals with two distinct but related 
topics: (a) training of users, and (b) system
diagnostic features. Training of users involves
two phases: (1) initial training (i.e., giving
enough training to allow a person to make some use
of the system) and (2) "continuing education"
(i.e., methods of raising the performance of
the searcher, showing him how to use more sophisticated
techniques and approaches, and informing him of system
changes, including new capabilities).

As in most other areas of endeavor, there is
no real substitute for hands-on experience. That
is, people will learn best how to use on-line
retrieval systems by practice. However, the new
user needs a limited amount of instruction before
he can ever get onto the system, even if this
instruction merely tells him how to log on. In
considering the possibilities for training the new
user, it is well to remember that two types of user
may be involved (i.e., librarians or other information
specialists on the one hand, and "end users" on
the other) and that methods suitable for the training 
of one type may not be suitable or feasible to use 
in the training of the other. 

It appears that there are at least four 
possible approaches to initial training: 

1. Personal instruction. This involves
a one-to-one relationship between the instructor
and the "student." The student learns by using the
terminal with the instructor at his shoulder. The
situation is probably ideal in many ways. Unfortunately
it is very expensive and therefore not suitable for
training of large numbers of potential users. It has
other limitations too. If potential users are
widespread, it is unlikely to be feasible to instruct
them in this way. It is most likely to be feasible 
in the training of a few users in a central location. 

2. Classroom instruction. This involves
a one-to-many relationship between the instructor
and the potential users. A classroom session, or
series of classroom sessions, including demonstrations,
is followed by periods allowing hands-on experience. 
The approach is less personalized than the one-to-one 
relationship but is also less expensive and more 
practical for the training of large numbers of users. 
It retains the advantage of providing a "live" 
instructor to answer questions and to assist with 
problems encountered in the training sessions. The 
classroom approach has been used successfully by NLM
in the training of MEDLINE search analysts and has been
used successfully by Informatics in training users 
of TOXICON, an application of RECON. 

3. Printed instructional materials. There
are at least three types of these: (a) user manuals,
(b) summaries of basic capabilities, and (c) on-line
printed instruction.

A complete user manual, describing all 
system features in some detail, is certainly needed.
However, such a manual is more useful as a reference
tool than as a training aid. The typical user manual
is much too detailed to be suitable for use in the
training of a new user; it gives him more than he
can possibly absorb and more than he really needs
as a beginning user. If a user manual is intended
to be used for instruction, it needs to be clearly
divided into easily digestible sections or lessons.
The first section or lesson merely tells how to
log-on to the system and presents only enough
commands and capabilities to allow a user to conduct
a relatively simple search. When he has mastered
this, and proved it at the terminal, he moves on
to the next lesson, which brings in new commands 
and more sophisticated procedures. He proceeds 
in this way until he has mastered the full system 
capabilities. 

An improvement on the conventional 
user manual, for training purposes, is a brochure 
summarizing basic system capabilities and including, 
perhaps, a sample search. A number of such
publications exist for various systems, including
DATA CENTRAL and DIALOG. 

On-line instruction is offered by several 
systems, including ORBIT, DATA CENTRAL and INTREX.
In the MEDLINE application of ORBIT, for example, 
once the user has logged in he is asked to identify 
himself as a new or experienced searcher. The "new" 
searcher is given the opportunity to receive 
instruction in how to use the system at the on-line 
terminal. The instruction given may be in abbreviated
or full form. DATA CENTRAL includes some quite
sophisticated instructional capabilities, designed
for CRT presentation, and these take the user through 
sample searches. 

The subject of on-line instruction 
generated some discussion. Those systems that provide 
such facilities are strongly in favor of them, while 
those who do not are against the use of on-line 
connect time for training purposes. Their attitude 
is "why use valuable on-line time to tell the user 
something he can just as easily read in a printed 
manual?" This viewpoint has some justification. 
Unfortunately, many people are very reluctant to 
read printed instructional materials, and are more 
likely to accept the on-line tutorial because of 
its novelty. The investigators of Project INTREX
discovered that users preferred on-line instruction,
even when the on-line instruction was identical to
instruction available in printed form. Moreover,
a well designed on-line tutorial, especially one
using a CRT, can be much more effective than a 
printed tool in giving the user "the feel" of the 
on-line system. This is particularly true when 
the tutorial takes the user through one or more 
sample searches, as in the case of the DATA CENTRAL 
system. 

4. Audiovisual instruction. This form
of instruction is probably the one most neglected.
A few script-slide presentations exist (e.g., on
DIALOG) but they are limited in scope. Script-slide
presentations are essentially static and a more
dynamic approach is clearly desirable. One
possibility which seems to have considerable promise 
is the use of videotape. A well-designed videotape 
might prove to be a reasonably good substitute for 
personal instruction. The videotape would be used 
to show an experienced searcher at work at the terminal. 
The viewer would be looking over the shoulder of 
this searcher. With audio commentary, the searcher 
would explain how to log-on, how to conduct a simple
search, how to display results, and so on. A second 
videotape could be used at a later time to introduce 
more sophisticated searching techniques. As far
as I am aware, no one is using videotape for training
in how to use on-line bibliographic systems.

The continuing education of the user presents 
different problems and challenges. There are also
many more facets to continuing education. Within
"continuing education" we can legitimately consider
all forms of help given to the user by the system,
all built-in tutorial features, and perhaps all the
searching aids that the system provides. The object
of this aspect of training is to increase the level
of the user's performance, to help him when he goes
astray, and to inform him of new system capabilities.
Some possible "continuing education" features are:



(a) EXPLAIN. Several systems have a 
feature of this type. That is, they provide an
"explain" command by which the user can obtain (on 
demand) explanations of system commands, error 
messages, and other system features. 

(b) HELP. Help to the on-line searcher 
who finds himself in difficulties can be provided
in several ways. First, in some systems (e.g., ORBIT)
the command HELP will bring to the user any one of
several pre-stored "solutions" to problems commonly
encountered. The user is first presented with a 
list of commonly encountered problems. He selects 
from this list the problem that is applicable 
to his own situation and the system provides him 
with a generalized statement on how to proceed. 
This type of help is reasonably satisfactory as 
long as it is possible to identify the major 
problems likely to occur in the system, to label 
these problems in a way that allows the user to 
recognize them, and to present solutions that are 
clear to the terminal user. Obviously, it is not 
possible to anticipate all types of problems that 
might occur and treat them in this standardized form. 

A better approach is to use the HELP 
command to bring in an experienced searcher at 
another terminal. Communication between experienced 
and inexperienced searchers is by way of these 
terminals. This is more personalized but it is 
also more expensive. It requires that a trained 
searcher be at an on-line monitor at all times, to 
await HELP messages and handle them accordingly. 
In this situation much searcher time is spent in 
waiting and it is unlikely that he or she could be 
very productive on other things while "on call" in 
this way. A second terminal has to be dedicated
to the "help" mode if this method of offering aid 
is adopted.

It was generally agreed that the best
approach to "help" is to provide it in live form
by giving the terminal user a special telephone
number to use when assistance is needed. The
telephone number puts him in touch with a fully 
experienced searcher, who can help him with his 
specific problem. It was felt by most participants 
that this approach is much better than the on-line 
communication for two reasons: (a) it is easier 
to communicate and to identify the problem, and its 
solution, by telephone, and (b) most users prefer 
to deal directly with another human rather than 
through the medium of the machine. 

(c) More advanced and sophisticated system 
techniques. The novice user should have some
opportunity of learning more sophisticated procedures
either by moving to a more advanced "lesson" in an
instruction manual or by calling up a more advanced
tutorial at the terminal.

(d) New system features. The system 
must provide some mechanism whereby its users are 
kept current with new developments and new system
capabilities. One method of achieving this, used 
by MEDLINE among other systems, is to have a NEWS 
command. When the user enters this command he is 
informed of recent changes in the data base, in the 
command language, in the error messages, and so on. 
Several systems also publish a newsletter to keep 
users informed of new developments of this type. 

(e) Other aids to the searcher. Although
not strictly instructional, a related set of system 
features are designed to make the system easier for 
the searcher to use. "Ease of use" features include 
features designed to reduce the likelihood of error 
and features designed to reduce the amount of keying
that needs to be done. Features of this type include
display of "menus" from which the user makes a
selection; acceptance of abbreviations, including
abbreviations for command names; acceptance of
common misspellings, including spelling errors in 
command names (BASIS-70, for example, will accept 
common variants) and index terms (RECON and DIALOG 
accept spelling errors in index terms in the sense 
that they display terms that are alphabetically 
close to the misspelled term); and the RENAME 
command (as in ORBIT) which allows a user to change 
command names, if he wishes, to a form with which 
he is more familiar (e.g., because he uses these
names in another system).
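
The display of alphabetically close terms in response to a misspelled index term can be sketched with a binary search over a sorted vocabulary. The function name, the `span` parameter and the sample vocabulary are assumptions; this does not reproduce the RECON or DIALOG term display.

```python
import bisect


def neighbors(vocabulary, entered, span=2):
    """Return the terms around the point where `entered` would file alphabetically."""
    i = bisect.bisect_left(vocabulary, entered)
    return vocabulary[max(0, i - span): i + span]


vocab = sorted([
    "hemoglobin", "hemolysis", "hemophilia",
    "hemorrhage", "hemostasis", "hepatitis",
])
print(neighbors(vocab, "hemorhage"))  # misspelling of "hemorrhage"
```

The approach works only when the error falls late enough in the word for the correct term to file nearby, which is the sense in which such systems "accept" spelling errors.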

In concluding my introduction to the subject of 
instructional features I raised the following additional
points:

1. That instruction of users of on-line 
systems involves three distinct aspects: (a) mechanics 
of using the terminal, (b) construction of logically
sound searching strategies, and (c) internal
"intellectual" factors, largely associated with the
characteristics of the particular data base involved.
I suggested that these aspects, from the viewpoint
of instruction, are in order of increasing complexity.
It is relatively easy to train people in how to use
the terminal (i.e., pure mechanics). It is somewhat
more difficult to teach them how to reduce their search 
requirements to the form of a Boolean strategy. 
Probably most difficult, however, is the problem of 
informing users on the characteristics of the data base. 
This is particularly true where the system is based 
upon a large controlled vocabulary of terms assigned 
to documents by trained indexers. To use the system 
most effectively the searcher needs to know something 
of indexing policies and protocols. He also needs 
to understand the controlled vocabulary and be 
able to find his way around it. It is not easy to
impart this information to the user. It takes several 
weeks, for example, to train an indexer at the National 
Library of Medicine. Clearly, the biomedical 
practitioner using MEDLINE cannot learn all the nuances 
of indexing and system vocabulary. It is for this
reason that systems based on uncontrolled vocabularies
(i.e., natural language) are likely to be used more
effectively by the person who is not an information 
specialist. Training these users in how to exploit 
natural language systems is also likely to be easier 
than training them how to use controlled vocabulary 
systems effectively. 

2. Possibilities for training on-line 
are limited by the type of terminal in use. A CRT 
or other form of video display seems to offer much 
greater scope for an imaginative, dynamic instructional 
approach than a typewriter terminal does. 

3. We may not have fully explored the 
possibilities for applying techniques of CAI 
(computer-aided instruction) in training users.
Quite sophisticated CAI programs exist in many other
areas of endeavor. Such techniques could be used
either in initial instruction of the potential user
or in actually "taking him by the hand" and leading
him through the conduct of a search. Some work on 
CAI techniques applied to the training of users of 
on-line retrieval systems has been carried out by 
Caruso at the University of Pittsburgh and by Lyman, 
working with the PLATO system at the University of 
Illinois. 

4. A question worth raising is "should
we be spending more time and effort in seeking techniques
to make on-line retrieval systems easier to use?" 
For example, do we need to make such systems less 
sensitive to simple errors of spacing, punctuation, 
and spelling? Or should we adopt the attitude 
that present systems are adequate, from a human 
factors standpoint, and that users should be required 
to be absolutely accurate? This appears to be a 
somewhat controversial question. Some believe that
present systems are easy enough to use, even by the
person who is not an information or data processing
specialist, while others believe that on-line systems
must be made simpler and more "forgiving" if they
are to be used extensively by scientists and other
professionals.

Some major instructional and diagnostic 
features of the eleven systems are shown in Table 3. 
Most are self-explanatory. 

Manuals 

All eleven systems provide some form of user 
manual to explain (1) how to use a terminal,
(2) how to log on and off the retrieval system,
(3) what commands are available and how they work,
(4) what data bases are accessible and how they
are structured, (5) how to develop a good search
strategy, and (6) what to do when X happens. These
manuals are of variable quality. Some are complete
and quite readable and others are incomplete and/or
virtually unreadable.

Modes of Instruction 

The major approach(es) taken when educating the
first-time user.

SYSTEM   - the system is programmed to help in the
           training

CLASS    - live or filmed courses are given
           periodically or upon request

PERSONAL - an expert sits down with the user at the
           terminal

READING  - the manual is read

On-Line Training



The user may use the terminal itself to obtain 
information concerning (1) terminal and typing
problems, (2) the command repertoire, (3) characteristics
of the data base, (4) hints for good searching,
(5) common pitfalls and their remedies.



ORBIT        - The beginning user easily falls into
               this material.

DATA CENTRAL - The beginning user easily falls into
               this material. When in verbose mode,
               system responses cue the user.

SPIRES       - The beginning user easily falls into
               this material. He can go through a
               sample dialog annotated with
               explanations.

RIQS         - The knowledgeable user can attach a
               tutor to the retrieval system. The
               tutor is the unique source of command
               and error explanations.

INTREX       - When in verbose mode, system responses
               cue the user.

LEADER       - When in verbose mode (the only mode),
               system responses cue the user.

Data Base Overview



Information concerning (1) the size of the 
data base, (2) the types of material in it, (3) the
time span covered by the material, (4) the fields
making up a record, (5) the searchable fields, and
(6) special strategies to use or pitfalls to avoid
with the data base. This type of information is
available on-line in the systems checked in this
column. 

Sample Searches 

Three of the systems allow the storage of 
searching strategies. The user, if he knows of the 
existence of a prestored strategy, can call it up 
and incorporate it into his own search formulation. 
These prestored strategies can include search logic. 
Unfortunately, the searcher must know of the existence 
of these strategies in order to be able to use them. 
If he enters a particular term that is the name of 
a prestored strategy, he is not automatically notified 
of its existence. 

On-Line Documentation 

"One page" descriptions of all commands and 
error messages, available on-line. For example, 
the command EXPLAIN in ORBIT will generate this
type of description. 

Search Logic Tracing 

The searcher can request a detailed description 
of how his multi-part search request led to the number 
of hits reported. 

Live Help 

Either a telephone number to call if desperate 
or a message command for requesting help from the 
on-line human consultant.

Vest Pocket Card



A durable folder containing command names and 
an explanation of how to get complete command 
descriptions. May be a card attached directly to 
the terminal.



Diagnostic Features 

Diagnostic features are incorporated into the 
system to determine failures and problems occurring, 
and thus to suggest possible corrective action. The 
only real way to obtain detailed information on 
user successes and failures, and factors affecting 
performance of the system, is by a fairly extensive 
evaluation based on a sample of actual searches. 
We were not particularly concerned with this type 
of one-time evaluation at this meeting. Rather, 
we were concerned with procedures that might be used 
to analyze the system routinely on a continuous basis. 

One approach is the on-line questionnaire that 
the user completes at the end of his search. Such 
a questionnaire can be used to determine various 
characteristics of the user, purpose of the search,
and the user's subjective assessment of its value 
to him. This, coupled with the connect time for the 
search, gives some general idea of its characteristics. 
The on-line questionnaire has some value in revealing 
who the users are, for what purpose they use the system, 
and their degree of satisfaction with the search results. 

Both MEDLINE and RECON have made use of on-line 
questionnaires of this type, but both systems have 
since abandoned this feature. In the case of RECON 
this decision was made because it was found that 
very few users completed the questionnaire anyway.

Another possible diagnostic element is the
"comments" feature, whereby a user can make suggestions
for improvements to the system or record problems
he has had in using it, by typing these comments
at the terminal at the time the problem arises or the
idea springs to his mind. These messages are analyzed
by the system managers at some later time. As seen
from Table 3, several of the eleven systems have
this feature.

By far the most useful "diagnostic" feature,
however, is the capability for on-line monitoring
of what users are doing. Several of the systems do
conduct various monitoring operations and summarize
the results in a "monitor log" (Table 3). With this
type of on-line monitoring of use we are given the
possibility of learning more than ever before about
how scientists and others use information retrieval
systems, and with what degree of success. Unfortunately,
monitoring of on-line users raises certain ethical
and possibly even legal problems. Monitoring of
a user without his express permission might possibly
be construed as a form of "wiretapping," an invasion
of privacy to which the United States is particularly
sensitive at the present time.

Several of the participants at this meeting 
expressed the feeling that the user must be told 
that he is being monitored and perhaps given the 
option to cut off the monitoring operation. In
SPIRES, for example, there is a command available that
will avoid the monitoring operation, although few
users know of its existence. Even if the user gives
his permission, of course, we are faced with the
problem that the user who knows he is being monitored
may not behave in exactly the same way that he would
if he were unaware of the monitoring operation.

Clearly, a system should be allowed to do certain 
types of monitoring, even if the monitoring is restricted 
to purely statistical summaries. The question is:
"What, if anything, can legitimately be monitored
and what cannot?" It seems clear that the system
should be allowed to monitor, record and analyze
statistical aspects of use. Such statistical data
will include number of uses, number of uses per
terminal, distribution of search sessions by connect
time and by CPU time expended, number of simultaneous
users, and so on. Data of this type are actually
reduced to the form of graphs, which can be displayed
on a CRT, in the RIQS system. Where the on-line
user is required to identify himself by any category
(e.g., experienced versus inexperienced), or to identify
his search by type or purpose, it seems legitimate
to include these variables in any overall data
collection activity.
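
The statistical summaries listed above can be illustrated with a short sketch. It is not drawn from any of the systems discussed; the record layout (a terminal identifier, a start time, connect minutes, and CPU seconds per session) and all names are assumptions made for illustration only. It computes uses per terminal, a distribution of sessions by connect time, and the peak number of simultaneous users.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """One search session; a hypothetical record layout for illustration."""
    terminal_id: str    # identifies the terminal, not the person
    start_min: float    # start time, in minutes from the start of the day
    connect_min: float  # connect time, in minutes
    cpu_sec: float      # CPU time expended, in seconds


def uses_per_terminal(sessions):
    """Number of sessions logged from each terminal."""
    return Counter(s.terminal_id for s in sessions)


def connect_time_histogram(sessions, bin_min=10):
    """Sessions per connect-time bin; each key is the bin's lower edge in minutes."""
    return Counter(int(s.connect_min // bin_min) * bin_min for s in sessions)


def peak_simultaneous_users(sessions):
    """Largest number of sessions open at the same moment."""
    events = []
    for s in sessions:
        events.append((s.start_min, 1))                    # session opens
        events.append((s.start_min + s.connect_min, -1))   # session closes
    # At equal times, closes (-1) sort before opens (+1), so a session that
    # ends exactly when another begins is not counted as overlapping.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


sample = [
    Session("T1", 0, 15, 2.0),
    Session("T1", 10, 5, 1.0),
    Session("T2", 12, 30, 4.0),
]
total_cpu_sec = sum(s.cpu_sec for s in sample)
```

None of these figures requires knowing who the searcher was; only the terminal and the timing of the session are recorded.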

In fact, it seems reasonable that the system 
should be allowed to monitor, and collect data on, 
the behavior of users in the aggregate, as opposed
to collecting data on the individual, identifiable
user. Such aggregate data would include statistics
on how the system is used (e.g., data on frequency
of term usage, frequency of command usage, and
frequency with which particular sources are retrieved)
and the types of problems encountered (e.g., frequency
of use of various error messages, frequency of use
of the HELP command and the specific form of help
requested, and the frequency and type of use made
of the EXPLAIN command). This type of "aggregate"
monitoring certainly seems justifiable and is of
great importance to system managers in showing
how the system is used and what might be done to
improve its performance.
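
As a minimal sketch of "aggregate" monitoring, the following assumes a hypothetical log of (user identifier, event kind, value) entries; the layout and names are illustrative and do not describe any system at the workshop. The user identifier is discarded before anything is counted, so only frequencies of commands, terms, and error messages are retained.

```python
from collections import Counter


def aggregate_usage(events):
    """Tally events by kind and value, ignoring who produced them.

    `events` is an iterable of (user_id, kind, value) tuples, an assumed
    log format. The user_id is read and then dropped.
    """
    totals = {}
    for _user_id, kind, value in events:
        totals.setdefault(kind, Counter())[value] += 1
    return totals


log = [
    ("u1", "command", "HELP"),
    ("u2", "command", "HELP"),
    ("u1", "command", "EXPLAIN"),
    ("u3", "term", "laser"),
    ("u2", "error", "UNKNOWN TERM"),
]
totals = aggregate_usage(log)
most_used_command = totals["command"].most_common(1)
```

Because the identifiers never reach the tallies, the resulting summary could be kept and analyzed without retaining a record of any individual's search.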

The system is on more dangerous ground, perhaps, 
if it monitors the individual user. Monitoring of
an individual user, either by observing him directly
from a second terminal or by recording his dialog
with the system for later analysis, would certainly
have potential value. A detailed analysis of steps
used in searches could identify the types of problems
that appear to be most prevalent and could indicate
which system aids are used successfully and which
are little used or used poorly. From this type of
analysis, at the level of the individual search
strategy, a great deal could be learned that might
be used in the later improvement of the system
itself or in the improvement of methods of training
users.

Unfortunately, it is this type of monitoring 
that is most likely to be construed as an invasion 
of privacy. The problem is likely to be more
important in certain situations than in others. In
an academic environment monitoring might be regarded
as less objectionable than it would be in an industrial
environment. Users of commercial services, who are
paying for use of the system, may be more opposed
to such monitoring than users of non-commercial 
services, especially a service provided to the user 
at no direct cost to him. 

In many systems, however, the terminal is
identifiable but the user himself is not. For example,
the log-in operation in MEDLINE identifies the user 
organization only. Many individual users may search 
the system from a particular terminal during the 
course of a day. These users are essentially 
anonymous because the system does not require them 
to identify themselves by name (unless they happen 
to request an off-line printout). It is difficult 
to see why an anonymous user should object to being 
monitored in many situations. The industrial 
situation, where a user may not want anyone else 
to know the subject matter of his current interests,
is a different situation.

It was reported by the representative of
LEADER that this system routinely monitors users
at Lehigh University and has even gone as far as
to contact users and make suggestions as to how they
might improve their use of the system, based on staff
analysis of search approaches used. No objections
to these activities have been raised by users at
Lehigh.

It would seem desirable, however, for an
operating system to obtain some blanket approval
of its monitoring operations. In some situations,
for example, it should be possible to include a
"permission to monitor" in any agreement made
between the system and a particular organizational
user.

SUMMARY



I believe this workshop was extremely valuable 
in bringing together, for constructive discussions, 
representatives of most of the leading on-line
bibliographic systems in use in the United States
in 1973. Although several of these systems are
commercial competitors, the meetings were held in
an atmosphere of mutual respect and a considerable
amount of valuable communication was achieved.
A remarkable level of agreement was reached on
the desirable features and characteristics of on-line 
systems of this type. 

If the workshop had any weak points, it was the 
fact that it concentrated largely on existing systems 
and present capabilities. Very little time was 
devoted to consideration of future systems and future 
capabilities. In the summary session, chaired by 
Parker, possible future trends were considered, 
but not in any great depth. The major thrust of 
this discussion was in the area of system interfaces. 
It was generally agreed that on-line information 
retrieval systems should not be regarded as completely 
independent entities. They will be required in the
future to interface with other systems. Such interfaces
will be with other bibliographic systems,
raising problems of compatibility and convertibility
of vocabularies and searching strategies, and with
other types of systems (e.g., interfaces with
statistical packages, with text editing systems,
with photocomposition systems, with CIM and COM
systems). System interface design may be one of
the most challenging problems facing us in the years
to come.

APPENDIX 2
SYSTEM SNAPSHOTS 



ORBIT

1. large bibliographic data bases (ERIC, CHEMCON,
NLM)

2. no full text search capability at present 

3. on the Tymshare network 

4. used primarily by intermediaries or 
intermediary-end user teams 

5. primarily accessed by non-CRT terminals 

6. acts as a service center as well as selling 
software 

7. terminal tends to be in a central location 

8. commercial rates 

9. PL-1, with some assembly language 
10. 360/50 and 370/155 



DIALOG 



1. large bibliographic data bases (NTIS, ERIC, 
Pandex) 

2. full text search capability 

3. on the Tymshare network 

4. used primarily by intermediaries or
intermediary-end user teams

5. primarily accessed by CRT terminals but
some use of teletypes

6. acts primarily as a service center

7. terminal tends to be in a central location

8. commercial rates

RECON 

1. large bibliographic data bases (TOXICON, NASA)
and management information data bases

2. full text search capability

3. on the Tymshare network

4. used primarily by intermediaries or the
end user

5. primarily accessed by non-CRT terminals

6. acts as a service center as well as selling
software

7. terminal tends to be in a central location
but can be user's personal terminal

8. commercial rates

9. assembly language

10. 360/370 series, models 40, 50, 65, 155

STAIRS

1. large textual data bases (Eng Index,
SUNY BCN, Congress)

2. designed primarily for full text

3. international access by IBM for in-house use

4. used both by intermediaries and end users

5. primarily accessed by CRT terminals; some
teletype

6. intent is to sell hardware, not service center

7. terminal tends to be in a central location 
but can be user's personal terminal 

8. outside IBM charging is up to the buyer 

9. assembly language 
10. 360 and 370 series 

DATA CENTRAL 

1. large textual data bases (EPA) and management
information data bases 

2. designed primarily for full text 

3. users access system remotely 

4. used primarily by end users but with some 
data bases has been used by intermediaries 

5. primarily accessed by CRT terminals 

6. intent is to lease software, not service 

7. terminals tend to be personal 

8. commercial rates

9. primarily assembly language; some COBOL

10. 360/40 and larger configurations



BASIS-70

1. large textual data bases (NTIS, CHEMCON) 
and management information data bases 

2. no full text capability at present 

3. on the Tymshare network 

4. used primarily by intermediaries or end 
user 

5. primarily accessed by CRT terminals (3000 baud) 

6. intended for service to in-house intermediaries

7. terminals tend to be personal

8. outside the institute, commercial rates apply

9. primarily FORTRAN; some CDC assembly language 

10. CDC 6000 series 



LEADER 



1. large bibliographic data bases (Eng Index,
Chemical Condensates) 

2. no full text search capability in the 
conventional sense 

3. used by both on-campus and off-campus users 

4. used primarily by end users or end-user 
intermediary teams 

5. primarily accessed by CRT terminals 

6. acts as a service center

7. terminals tend to be in a central location

8. outside the university commercial rates apply

9. FORTRAN

10. CDC 6400



SPIRES

1. primarily management information data bases
but one large bibliographic data base (MARC)

2. no full text search capability

3. use restricted to terminals on Stanford
campus at present, although it is possible
to access remotely

4. used primarily by end users

5. primarily accessed by non-CRT terminals

6. intent is in-house service although both
distributing software and considering service
center role

7. terminals split between personal and in a
central location

8. minimal charges

9.

10. 360/67 and 360/91



NASIS

1. large photographic description data base
(ERTS) and management information data base

2. full text search is possible

3. primary access is via the federal telephone
network

4. used both by end users or intermediaries

5. primarily accessed by non-CRT terminals;
can be accessed by CRT terminals up to
1200 baud

6. acts as a service center

7. terminals tend to be personal 

8. no charging algorithm 

9. PL-1 

10. 360/50 is minimum configuration needed 



RIQS 



1. personal data bases — both bibliographic 
and management information 

2. there are no inverted indexes and no full
text search 

3. use restricted to faculty and staff at 
Northwestern University 

4. used primarily by end users 

5. primarily accessed by non-CRT terminals 

6. intended for in-house searching

7. terminals are both in a central location
and in personal locations

8. no charging algorithm

10. CDC 6400



INTREX

(in transition from experimental to operational
status)

1. small bibliographic data base (INTREX) 

2. full text search capability exists 

3. use restricted to MIT campus at present, 
but can be accessed remotely 

4. used primarily by end users or end user- 
intermediary teams 

5. primarily accessed by CRT terminals 

6. intended for in-house searching

7. terminals are in a central location but can
be personal

8. no charging algorithm
9. 

10. at present being reprogrammed for 370/165

APPENDIX 3



SYSTEM REPRESENTATIVES 



Dr. Louis Stern (215) 867-5071 x323

Center for the Information Sciences
Lehigh University
Bethlehem, Pennsylvania 18015

Mr. Larry Stevens (301) 770-3000 x217

Informatics, Inc.
6000 Executive Boulevard
Rockville, Maryland 20852

Mr. Richard Giering (513) 426-3111

Mead Technology Laboratories
Research Park
Dayton, Ohio 45432

Dr. Richard Marcus (617) 253-1000 x2340

Project Intrex 35-406
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139

Dr. Charles Goldstein (216) 433-4000 x6660

Computerized Information Systems Office
National Aeronautics and Space Administration
Lewis Research Center
Cleveland, Ohio 44135

Dr. Roger Summit (415) 493-4411 x45034
Mr. Mark Radwin (415) 493-4411 x45769

Lockheed Palo Alto Research Laboratory
Palo Alto, California 94304

Mr. Donald Black (213) 393-9411 x7513

System Development Corporation
2500 Colorado Avenue
Santa Monica, California 90406

Mr. Dave Colombo (614) 299-3151 x3240

Battelle Memorial Institute
505 King Avenue
Columbus, Ohio 43201

Mr. Stan Friedman (914) 765-2123

International Business Machines Corp.
Armonk, New York 10504

Mr. Larry Rosen (415) 321-2300 x4531

Stanford Computation Center
116 Polya Hall
Stanford University
Stanford, California 94305

Dr. Benjamin Mittman (312) 492-3682
Mr. Wayne Dominick

Vogelback Computing Center
Northwestern University
2129 Sheridan Road
Evanston, Illinois 60201



VISITORS 



Dr. Dennis Fife (202) EM 3-4040

National Bureau of Standards
Connecticut Ave. & Van Ness, N.W.
Washington, D.C.

Mr. Jim DeiRossi (202) EM 3-4040

National Bureau of Standards
Connecticut Ave. & Van Ness, N.W.
Washington, D.C.

Mr. Richard Lee (202) 632-5818

Office of Science Information Service
National Science Foundation
1800 G Street, N.W.
Washington, D.C. 20550

Prof. David Thompson (415) 321-2300 x4474

Industrial Engineering Department
Stanford University
Stanford, California 94305



[Feature table row labels; per-system columns not recoverable]

REQUEST SETS *
DICTIONARY ACCESS *
RELATED TERM CAPABILITY
HIERARCHICAL THESAURUS
CAPABILITY OF INCORPORATING SYNONYM TABLES OR TERM HIERARCHIES
SEARCH FIELD CONTROL
BOOLEAN OPERATORS *
WORD PROXIMITY OPERATORS
ARITHMETIC OPERATORS
SUFFIX REMOVAL *
PHRASE DECOMPOSITION
DATA BASE PARTITIONING
SEQUENTIAL SEARCHING
PROFILE SEARCHING

SEARCH REVIEW *
PREDEFINED FORMATS *
FIELD SPECIFICATION *
RAPID SCAN
HIGHLIGHTING
EXPANDING
SORTING
RANKING
COMPUTING
MICROFICHE
OFF-LINE PRINTING *
DISPLAY OF GRAPHS
BATCH RETRIEVAL
RANDOM CITATION SELECTION

MANUALS (A*)
SYSTEM
CLASS
PERSONAL
READING
ON-LINE TRAINING
DATA BASE OVERVIEW
SAMPLE SEARCHES
ON-LINE DOCUMENTATION *
SEARCH LOGIC TRACING
LIVE HELP *
VEST POCKET CARD *
COMMENTS
MONITOR LOG